Preface to the Third
Edition


The field of inverse problems is growing rapidly, and during the 9 years since 
the appearance of the second edition of this book many new aspects and 
subfields have been developed. Since, obviously, not every subject can be 
treated in a single monograph, the author had to make a decision. As I 
pointed out already in the preface of the first edition, my intention was—and 
still is—to introduce the reader to some of the basic principles and devel- 
opments of this field of inverse problems rather than going too deeply into 
special topics. As I continued to lecture on inverse problems at the University 
of Karlsruhe (now Karlsruhe Institute of Technology), new material has been 
added to the courses and thus also to this new edition because the idea of this 
book is still to serve as a type of textbook for a course on inverse problems. 
I have decided to extend this monograph in two directions. For some readers, 
it was perhaps a little unsatisfactory that only the abstract theory for linear 
problems was presented but the applications to inverse eigenvalue problems, 
electrical impedance tomography, and inverse scattering theory are of a 
nonlinear type. For that reason, and also because the abstract theories for 
Tikhonov’s method and Landweber’s iteration for nonlinear equations have 
come to a certain completion, I included a new chapter (Chapter 4) on these 
techniques for locally improperly posed nonlinear equations in Hilbert spaces 
with an outlook into some rather new developments for Banach spaces. The 
former Chapters 4, 5, and 6 are moved to 5, 6, and 7, respectively. The 
additional functional analytic tools needed in this new Chapter 4 result in 
two new sections of Appendix A on convex analysis and weak topologies. 
As a second new topic, a separate section (Section 7.6) on interior 
transmission eigenvalues is included. These eigenvalue problems arise natu- 
rally in the study of inverse scattering problems for inhomogeneous media 
and were introduced already in the former editions of this monograph. 
Besides their importance in scattering theory, the transmission eigenvalue 
problem is an interesting subject in itself, mainly because it fails to be 
self-adjoint. The investigation of the spectrum is a subject of current
research. Special issues of Inverse Problems [37] and recent monographs [34,
55] have already addressed this topic. For the study of complex eigenvalues,
one has so far been restricted to radially symmetric refractive indices, which
reduce the partial differential equations to ordinary differential equations.
Classical tools from complex analysis make it possible to prove the existence 
of complex eigenvalues (Subsection 7.6.1) and uniqueness for the
corresponding inverse spectral problem (Subsection 7.6.3). I think that this
analogue of the inverse Sturm–Liouville problem of Chapter 5 is a natural
completion of the study of interior transmission eigenvalues.

Finally, a rather large number of mistakes, ambiguities, and misleading 
formulations has been corrected in every chapter. As major mistakes, the 
proofs of Theorems 4.22 (a) and 6.30 (d) (referring to the numbering of the 
second edition) have been corrected. I want to thank all of my colleagues and
the readers of the first two editions for the overwhelmingly positive responses
and, last but not least, the publisher for its encouragement to write this
third edition.


Karlsruhe, Germany Andreas Kirsch 
December 2020 


Preface to the Second 
Edition 


The first edition of the book appeared 14 years ago. The area of inverse 
problems is still a growing field of applied mathematics and an attempt at a 
second edition after such a long time was a difficult task for me. The number 
of publications on the subjects treated in this book has grown considerably 
and a new generation of mathematicians, physicists, and engineers has 
brought new concepts into the field. My philosophy, however, has never been 
to present a comprehensive book on inverse problems that covers all aspects. 
My purpose was (as I pointed out in the preface of the first edition), and still 
is, to present a book that can serve as a basis for an introductory (graduate) 
course in this field. The choice of material covered in this book reflects my
personal point of view: students should learn the basic facts for linear 
ill-posed problems including some of the present classical concepts of regu- 
larization and also some important examples of more modern nonlinear 
inverse problems. 

Although there has been considerable progress made on regularization 
concepts and convergence properties of iterative methods for abstract non- 
linear inverse problems, I decided not to include these new developments in 
this monograph. One reason is that these theoretical results on nonlinear 
inverse problems are still not applicable to the inverse scattering problems 
that are my major field of interest. Instead, I refer the reader to the mono- 
graphs [92, 149] where regularization methods for nonlinear problems are 
intensively treated. 

Also, in my opinion, every nonlinear inverse problem has its own char- 
acteristic features that should be used for a successful solution. With respect 
to the inverse scattering problem to determine the shape of the support of the 
contrast, a whole class of methods has been developed during the last decade, 
sometimes subsumed under the name Sampling Methods. Because they are 
very popular not only in the field of inverse scattering theory but also in the 
field of electrical impedance tomography (EIT) I decided to include the 
Factorization Method as one of the prominent members in this monograph.
The Factorization Method is particularly simple for the problem of EIT, and
this field has attracted a lot of attention during the past decade. Therefore, a
chapter on EIT has been added to this monograph as Chapter 5, and the
chapter on inverse scattering theory now becomes Chapter 6.

The main changes of this second edition compared to the first edition 
concern only Chapters 5 and 6 and Appendix A. As just mentioned, in 
Chapter 5 we introduce the reader to the inverse problem of electrical 
impedance tomography. This area has become increasingly important 
because of its applications in medicine and engineering sciences. Also, the 
methods of EIT serve as tools and guidelines for the investigation of other 
areas of tomography such as optical and photoacoustic tomography.

The forward model of EIT is usually set up in the weak sense, that is, in 
appropriate Sobolev spaces. Although I expect that the reader is familiar with 
the basic facts on Sobolev spaces such as the trace theorem and Friedrichs’
inequality, a tutorial section on Sobolev spaces on the unit disk is added in 
Appendix A, Section A.5. The approach using Fourier techniques is not very 
common but fits well with the presentation of Sobolev spaces of fractional 
order on the boundary of the unit disk in Section A.4 of Appendix A. In 
Chapter 5 on electrical impedance tomography the Neumann–Dirichlet
operator is introduced and its most important properties such as
monotonicity, continuity, and differentiability are shown. Uniqueness of the
inverse problem is proven for the linearized problem only because it was this
example for which Calderón presented his famous proof of uniqueness.
(The fairly recent uniqueness proof by Astala and Päivärinta in [10] is far too
complicated to be treated in this introductory work.) As mentioned above, the
Factorization Method was developed during the last decade. It is a completely
new and mathematically elegant approach to characterize the shape of the
domain where the conductivity differs from the background by the
Neumann–Dirichlet operator. The Factorization Method is an example of an approach
that uses special features of the nonlinear inverse problem under consideration 
and has no analogy for traditional linear inverse problems. 

Major changes are also made in Chapter 6 on inverse scattering problems. 
A section on the Factorization Method has been added (Section 6.4) because 
inverse scattering problems are the type of problem for which it is perfectly 
applicable. The rigorous mathematical treatment of the Factorization 
Method makes it necessary to work with weak solutions of the scattering 
problem. Therefore, here we also have to use (local) Sobolev spaces rather 
than spaces of continuously differentiable functions. I took the opportunity to 
introduce the reader to a (in my opinion) very natural approach to prove 
existence of weak solutions by the Lippmann–Schwinger equation in
$L^2(D)$ (where $D$ contains the support of the contrast $n - 1$). The key is the
fact that the volume potential with any $L^2$-density solves the corresponding
inhomogeneous Helmholtz equation in the weak sense (just as in the case of
smooth densities); this can easily be proved by using the classical result and a
density argument. The notion of weak solutions has the advantage of allowing
arbitrary $L^\infty$-functions as indices of refraction but makes it necessary to
modify almost all of the arguments in this chapter slightly. In Section 6.7 we
dropped the motivating example for the uniqueness of the inverse scattering 
problem (Lemma 6.8 in the first edition) because it has already been pre- 
sented for the uniqueness of the linearized inverse problem of impedance 
tomography. 

Finally, I want to thank all the readers of the first edition of the monograph 
for their extraordinarily positive response. I hope that with this second edition
I have added some course material suitable for presentation in a graduate course
on inverse problems. I have found that my students like the
problem of impedance tomography and, in particular, the Factorization
Method, and I hope that this is true for others!


Karlsruhe, Germany Andreas Kirsch 
March 2011


Preface to the First
Edition


Following Keller [152] we call two problems inverse to each other if the
formulation of each of them requires full or partial knowledge of the other. By 
this definition, it is obviously arbitrary which of the two problems we call the 
direct and which we call the inverse problem. But usually, one of the prob- 
lems has been studied earlier and, perhaps, in more detail. This one is usually 
called the direct problem, whereas the other is the inverse problem. 
However, there is often another more important difference between these two 
problems. Hadamard (see [115]) introduced the concept of a well-posed
problem, originating from the philosophy that the mathematical model of a
physical problem has to have the properties of uniqueness, existence, and 
stability of the solution. If one of the properties fails to hold, he called the 
problem ill-posed. It turns out that many interesting and important inverse 
problems in science lead to ill-posed problems, whereas the corresponding 
direct problems are well-posed. Often, existence and uniqueness can be forced 
by enlarging or reducing the solution space (the space of “models”). For 
restoring stability, however, one has to change the topology of the spaces, 
which is in many cases impossible because of the presence of measurement 
errors. At first glance, it seems to be impossible to compute the solution of a 
problem numerically if the solution of the problem does not depend contin- 
uously on the data, that is, for the case of ill-posed problems. Under addi- 
tional a priori information about the solution, such as smoothness and bounds 
on the derivatives, however, it is possible to restore stability and construct 
efficient numerical algorithms. 

We make no claim to cover all of the topics in the theory of inverse 
problems. Indeed, with the rapid growth of this field and its relationship to 
many fields of natural and technical sciences, such a task would certainly be 
impossible for a single author in a single volume. The aim of this book is 
twofold: first, we introduce the reader to the basic notions and difficulties 
encountered with ill-posed problems. We then study the basic properties of 
regularization methods for linear ill-posed problems. These methods can 
roughly be classified into two groups, namely, whether the regularization
parameter is chosen a priori or a posteriori. We study some of the most
important regularization schemes in detail. 

The second aim of this book is to give a first insight into two special 
nonlinear inverse problems that are of vital importance in many areas of the 
applied sciences. In both inverse spectral theory and inverse scattering the- 
ory, one tries to determine a coefficient in a differential equation from mea- 
surements of either the eigenvalues of the problem or the field “far away” 
from the scatterer. We hope that these two examples clearly show that a 
successful treatment of nonlinear inverse problems requires a solid knowledge 
of characteristic features of the corresponding direct problem. The combi- 
nation of classical analysis and modern areas of applied and numerical 
analysis is, in the author’s opinion, one of the fascinating features of this 
relatively new area of applied mathematics. 

This book arose from a number of graduate courses, lectures, and survey 
talks during my time at the universities of Göttingen and
Erlangen/Nürnberg. It was my intention to present a fairly elementary and complete
introduction to the field of inverse problems, accessible not only to 
mathematicians but also to physicists and engineers. I tried to include as 
many proofs as possible as long as they required knowledge only of classical 
differential and integral calculus. The notions of functional analysis make it 
possible to treat different kinds of inverse problems in a common language 
and extract its basic features. For the convenience of the reader, I have 
collected the basic definitions and theorems from linear and nonlinear 
functional analysis at the end of the book in an appendix. Results on 
nonlinear mappings, in particular for the Fréchet derivative, are only needed 
in Chapters 4 and 5. 

The book is organized as follows. In Chapter 1, we begin with a list of 
pairs of direct and inverse problems. Many of them are quite elementary and 
should be well known. We formulate them from the point of view of inverse 
theory to demonstrate that the study of particular inverse problems has a 
long history. Sections 1.3 and 1.4 introduce the notions of ill-posedness and 
the worst-case error. Although ill-posedness of a problem (roughly speaking)
implies that the solution cannot be computed numerically — which is a very 
pessimistic point of view — the notion of the worst-case error leads to the 
possibility that stability can be recovered if additional information is 
available. We illustrate these notions with several elementary examples. 

In Chapter 2, we study the general regularization theory for linear 
ill-posed equations in Hilbert spaces. The general concept in Section 2.1 is 
followed by the most important special examples: Tikhonov regularization in 
Section 2.2, Landweber iteration in Section 2.3, and spectral cutoff in 
Section 2.4. These regularization methods are applied to a test example in 
Section 2.5. While in Sections 2.1–2.5 the regularization parameter has been
chosen a priori, that is, before starting the actual computation, Sections
2.6–2.8 are devoted to regularization methods in which the regularization
parameter is chosen implicitly by the stopping rule of the algorithm. In
Sections 2.6 and 2.7, we study Morozov’s discrepancy principle and, again,
Landweber’s iteration method. In contrast to these linear regularization 
schemes, we will investigate the conjugate gradient method in Section 2.8. 
This algorithm can be interpreted as a nonlinear regularization method and 
is much more difficult to analyze. 

Chapter 2 deals with ill-posed problems in infinite-dimensional spaces. 
However, in practical situations, these problems are first discretized. The 
discretization of linear ill-posed problems leads to badly conditioned finite 
linear systems. This subject is treated in Chapter 3. In Section 3.1, we recall 
basic facts about general projection methods. In Section 3.2, we study several 
Galerkin methods as special cases and apply the results to Symm’s integral 
equation in Section 3.3. This equation serves as a popular model equation in 
many papers on the numerical treatment of integral equations of the first 
kind with weakly singular kernels. We present a complete and elementary 
existence and uniqueness theory of this equation in Sobolev spaces and apply 
the results about Galerkin methods to this equation. In Section 3.4, we study 
collocation methods. Here, we restrict ourselves to two examples: the moment 
collocation and the collocation of Symm’s integral equation with trigono- 
metric polynomials or piecewise constant functions as basis functions. In 
Section 3.5, we compare the different regularization techniques for a concrete 
numerical example of Symm’s integral equation. Chapter 3 is completed by 
an investigation of the Backus–Gilbert method. Although this method does
not quite fit into the general regularization theory, it is nevertheless widely 
used in the applied sciences to solve moment problems. 

In Chapter 4, we study an inverse eigenvalue problem for a linear ordi- 
nary differential equation of second order. In Sections 4.2 and 4.3, we develop 
a careful analysis of the direct problem, which includes the asymptotic 
behaviour of the eigenvalues and eigenfunctions. Section 4.4 is devoted to the 
question of uniqueness of the inverse problem, that is, the problem of 
recovering the coefficient in the differential equation from the knowledge of 
one or two spectra. In Section 4.5, we show that this inverse problem is 
closely related to a parameter identification problem for parabolic equations. 
Section 4.6 describes some numerical reconstruction techniques for the 
inverse spectral problem. 

In Chapter 5, we introduce the reader to the field of inverse scattering 
theory. Inverse scattering problems occur in several areas of science and 
technology, such as medical imaging, nondestructive testing of material, and 
geological prospecting. In Section 5.2, we study the direct problem and prove 
uniqueness, existence, and continuous dependence on the data. In Section 5.3, 
we study the asymptotic form of the scattered field as $r \to \infty$ and introduce
the far field pattern. The corresponding inverse scattering problem is to
recover the index of refraction from a knowledge of the far field pattern. We 
give a complete proof of uniqueness of this inverse problem in Section 5.4. 
Finally, Section 5.5 is devoted to the study of some recent reconstruction 
techniques for the inverse scattering problem. 




Chapter 5 differs from previous ones in the unavoidable fact that we have 
to use some results from scattering theory without giving proofs. We only 
formulate these results, and for the proofs we refer to easily accessible 
standard literature. 

There exists a tremendous amount of literature on several aspects of 
inverse theory ranging from abstract regularization concepts to very concrete 
applications. Instead of trying to give a complete list of all relevant
contributions, I mention only the monographs [17, 105, 110, 136, 168, 173, 174, 175,
182, 197, 198, 215, 263, 264], the proceedings [5, 41, 73, 91, 117, 212, 237,
259], and the survey articles [88, 148, 152, 155, 214] and refer to the references
therein. 

This book would not have been possible without the direct or indirect 
contributions of numerous colleagues and students. But, first of all, I would 
like to thank my father for his ability to stimulate my interest and love of 
mathematics over the years. Also, I am deeply indebted to my friends and 
teachers, Professor Dr. Rainer Kress and Professor David Colton, who 
introduced me to the field of scattering theory and influenced my mathematical
life in an essential way. This book is dedicated to my long friendship with
them! 

Particular thanks are given to Dr. Frank Hettlich, Dr. Stefan Ritter, and 
Dipl.-Math. Markus Wartha for carefully reading the manuscript. 
Furthermore, I would like to thank Professor William Rundell and Dr. 
Martin Hanke for their manuscripts on inverse Sturm–Liouville problems and
conjugate gradient methods, respectively, on which parts of Chapters 4 and 2 
are based. 


Karlsruhe, Germany Andreas Kirsch 
April 1996


Chapter 1


Introduction and Basic 
Concepts 


1.1 Examples of Inverse Problems 


In this section, we present some examples of pairs of problems that are inverse 
to each other. We start with some simple examples that are normally not even 
recognized as inverse problems. Most of them are taken from the survey article 
[152] and the monograph [111]. 


Example 1.1 
Find a polynomial $p$ of degree $n$ with given zeros $x_1,\dots,x_n$. This problem is
inverse to the direct problem: Find the zeros $x_1,\dots,x_n$ of a given polynomial
$p$. In this example, the inverse problem is easier to solve. Its solution is
$p(x) = c\,(x - x_1)\cdots(x - x_n)$ with an arbitrary constant $c$.
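The easy direction of this pair can be sketched in a few lines of Python. The function names `poly_from_zeros` and `evaluate` are our own illustrative choices, and the normalization $c = 1$ is an assumption; the sketch simply expands the product of linear factors.

```python
def poly_from_zeros(zeros, c=1.0):
    """Coefficients (highest degree first) of c * (x - z_1) * ... * (x - z_n)."""
    coeffs = [c]
    for z in zeros:
        # Multiply the current polynomial p(x) by the factor (x - z).
        shifted = coeffs + [0.0]                   # coefficients of x * p(x)
        scaled = [0.0] + [z * a for a in coeffs]   # coefficients of z * p(x)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


def evaluate(coeffs, x):
    """Horner evaluation for coefficients ordered from highest degree to lowest."""
    value = 0.0
    for a in coeffs:
        value = value * x + a
    return value


# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
p = poly_from_zeros([1.0, 2.0, 3.0])
```

The direct problem (finding the zeros of a given $p$) has no comparably simple finite procedure for large $n$, which is why the inverse problem is the easier one here.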


Example 1.2 

Find a polynomial $p$ of degree $n$ that assumes given values $y_1,\dots,y_n \in \mathbb{R}$ at
given points $x_1,\dots,x_n \in \mathbb{R}$. This problem is inverse to the direct problem of
calculating the given polynomial at given $x_1,\dots,x_n$. The inverse problem is
the Lagrange interpolation problem.
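A minimal sketch of the Lagrange interpolation formula follows. The function name `lagrange_interpolate` is hypothetical, and we assume pairwise distinct nodes; the polynomial evaluated is the unique one of degree at most $n - 1$ through the $n$ given points.

```python
def lagrange_interpolate(xs, ys, x):
    """Value at x of the lowest-degree polynomial through the points (xs[j], ys[j])."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        # Lagrange basis polynomial: equals 1 at xs[j] and 0 at every other node.
        basis = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                basis *= (x - xm) / (xj - xm)
        total += yj * basis
    return total


# The data below are samples of x^2 + x + 1 at x = 0, 1, 2.
value_at_3 = lagrange_interpolate([0.0, 1.0, 2.0], [1.0, 3.0, 7.0], 3.0)
```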


Example 1.3 

Given a real symmetric $n \times n$ matrix $A$ and $n$ real numbers $\lambda_1,\dots,\lambda_n$, find
a diagonal matrix $D$ such that $A + D$ has the eigenvalues $\lambda_1,\dots,\lambda_n$. This
problem is inverse to the direct problem of computing the eigenvalues of the
given matrix $A + D$.
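For $n = 2$ the problem can be solved by hand, which makes a small illustration possible. The sketch below is our own (the function name `diagonal_shifts_2x2` is hypothetical); it matches trace and determinant of $A + D$ with those prescribed by $\lambda_1, \lambda_2$ and shows that a real solution exists only if $(\lambda_1 - \lambda_2)^2 \ge 4 a_{12}^2$.

```python
import math


def diagonal_shifts_2x2(a11, a12, a22, lam1, lam2):
    """All real (d1, d2) such that [[a11 + d1, a12], [a12, a22 + d2]] has eigenvalues lam1, lam2.

    With p = a11 + d1 and q = a22 + d2, matching trace and determinant gives
    p + q = lam1 + lam2 and p * q = lam1 * lam2 + a12**2.
    """
    s = lam1 + lam2
    prod = lam1 * lam2 + a12 ** 2
    disc = s * s - 4.0 * prod          # equals (lam1 - lam2)**2 - 4 * a12**2
    if disc < 0.0:
        return []                      # no real diagonal matrix D exists
    root = math.sqrt(disc)
    p_values = {(s + root) / 2.0, (s - root) / 2.0}
    return sorted((p - a11, (s - p) - a22) for p in p_values)


two_solutions = diagonal_shifts_2x2(0.0, 1.0, 0.0, -2.0, 2.0)
no_solution = diagonal_shifts_2x2(0.0, 1.0, 0.0, 0.0, 1.0)
```

Even in this tiny case the inverse problem may have no solution or two solutions, whereas the direct problem always has exactly one answer.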


Example 1.4 

This inverse problem is used with intelligence tests: Given the first few terms
$a_1, a_2, \dots, a_k$ of a sequence, find the law of formation of the sequence; that is,
find $a_n$ for all $n$! Usually, only the next two or three terms are asked for to show
that the law of formation has been found. The corresponding direct problem is
to evaluate the sequence $(a_n)$ given the law of formation. It is clear that such
inverse problems always have many solutions (from the mathematical point of
view), and for this reason their use on intelligence tests has been criticized.
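The non-uniqueness is easy to demonstrate. In the following sketch (our own example, with the hypothetical helper `extend_by_polynomial`) the terms 1, 2, 4, 8 are continued by the lowest-degree polynomial through them, using a table of finite differences; the law $a_n = 2^{n-1}$ would instead continue with 16.

```python
def extend_by_polynomial(terms, extra):
    """Continue a sequence with the lowest-degree polynomial through the given terms."""
    # Build the difference table; the last entry of each row drives the extension.
    rows = [list(terms)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    last = [row[-1] for row in rows]
    result = list(terms)
    for _ in range(extra):
        # The highest-order difference stays constant; update the others from it.
        for k in range(len(last) - 2, -1, -1):
            last[k] += last[k + 1]
        result.append(last[0])
    return result


cubic_law = extend_by_polynomial([1, 2, 4, 8], 2)   # continues with 15, 26
power_law = [2 ** n for n in range(6)]              # continues with 16, 32
```

Both laws of formation reproduce the four given terms, so the data alone cannot distinguish between them.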


Example 1.5 (Geological prospecting) 

In general, this is the problem of determining the location, shape, and/or some 
parameters (such as conductivity) of geological anomalies in the Earth’s inte- 
rior from measurements at its surface. We consider a simple one-dimensional 
example and describe the following inverse problem. 

Determine changes $\rho = \rho(x')$, $0 \le x' \le 1$, of the mass density of an anomalous
region at depth $h$ from measurements of the vertical component $f_v(x)$ of the
change of force at $x$. $\rho(x')\Delta x'$ is the mass of a “volume element” at $x'$ and
$\sqrt{(x - x')^2 + h^2}$ is its distance from the instrument. The change of gravity is
described by Newton’s law of gravity $f = \gamma\, m/r^2$ with gravitational constant $\gamma$.
For the vertical component, we have

\[
  \Delta f_v(x) \;=\; \gamma\, \frac{\rho(x')\,\Delta x'}{(x - x')^2 + h^2}\, \cos\theta
  \;=\; \gamma\, \frac{h\, \rho(x')\,\Delta x'}{\bigl[(x - x')^2 + h^2\bigr]^{3/2}} .
\]

This yields the following integral equation for the determination of $\rho$:

\[
  f_v(x) \;=\; \gamma h \int_0^1 \frac{\rho(x')}{\bigl[(x - x')^2 + h^2\bigr]^{3/2}}\, dx'
  \quad \text{for } 0 \le x \le 1 . \tag{1.1}
\]

We refer to [6, 105, 277] for further reading on this and related inverse problems 
in geological prospecting. 
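The direct problem belonging to (1.1), computing $f_v$ from a known density, can be approximated by a quadrature rule. The following is only a sketch: the function name `vertical_gravity`, the midpoint rule, the default $\gamma = 1$ and the number of nodes are our assumptions.

```python
def vertical_gravity(rho, h, x, gamma=1.0, n=2000):
    """Midpoint-rule approximation of gamma*h * int_0^1 rho(x') / ((x-x')^2 + h^2)^(3/2) dx'."""
    dx = 1.0 / n
    total = 0.0
    for j in range(n):
        xp = (j + 0.5) * dx                       # midpoint of the j-th subinterval
        total += rho(xp) / ((x - xp) ** 2 + h ** 2) ** 1.5
    return gamma * h * total * dx


# Constant density rho = 1 at depth h = 0.5, measured at x = 0.5.
# The integral can be evaluated in closed form in this case and gives 2*sqrt(2).
f_mid = vertical_gravity(lambda xp: 1.0, h=0.5, x=0.5)
```

The kernel is smooth for $h > 0$, so fine details of $\rho$ are strongly damped in $f_v$.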


Example 1.6 (Inverse scattering problem) 

Find the shape of a scattering object, given the intensity (and phase) of sound 
or electromagnetic waves scattered by this object. The corresponding direct 
problem is that of calculating the scattered wave for a given object.


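More precisely, the direct problem can be described as follows. Let a bounded
region $D \subset \mathbb{R}^N$ ($N = 2$ or $3$) be given with smooth boundary $\partial D$ (the scattering
object) and a plane incident wave $u^i(x) = e^{ik\hat\theta\cdot x}$, where $k > 0$ denotes the wave
number and $\hat\theta$ is a unit vector that describes the direction of the incident wave.
The direct problem is to find the total field $u = u^i + u^s$ as the sum of the incident
field $u^i$ and the scattered field $u^s$ such that

\[
  \Delta u + k^2 u = 0 \quad \text{in } \mathbb{R}^N \setminus \overline{D}, \qquad
  u = 0 \quad \text{on } \partial D, \tag{1.2a}
\]

\[
  \frac{\partial u^s}{\partial r} - ik\,u^s = \mathcal{O}\bigl(r^{-(N+1)/2}\bigr)
  \quad \text{for } r = |x| \to \infty \text{ uniformly in } \frac{x}{|x|} . \tag{1.2b}
\]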


For acoustic scattering problems, $v(x,t) = u(x)e^{-i\omega t}$ describes the pressure and
$k = \omega/c$ is the wave number with speed of sound $c$. For suitably polarized time
harmonic electromagnetic scattering problems, Maxwell’s equations reduce to
the two-dimensional Helmholtz equation $\Delta u + k^2 u = 0$ for the components of
the electric (or magnetic) field $u$. The wave number $k$ is given in terms of the
dielectric constant $\varepsilon$ and permeability $\mu$ by $k = \sqrt{\varepsilon\mu}\,\omega$.

In both cases, the radiation condition (1.2b) yields the following asymptotic 
behavior: 


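\[
  u^s(x) \;=\; \frac{\exp(ik|x|)}{|x|^{(N-1)/2}}\, u_\infty(\hat x)
  \;+\; \mathcal{O}\bigl(|x|^{-(N+1)/2}\bigr) \quad \text{as } |x| \to \infty ,
\]

where $\hat x = x/|x|$. The inverse problem is to determine the shape of $D$ when the
far field pattern $u_\infty(\hat x)$ is measured for all $\hat x$ on the unit sphere in $\mathbb{R}^N$.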

These and related inverse scattering problems have various applications in 
computer tomography, seismic and electromagnetic exploration in geophysics, 
and nondestructive testing of materials, for example. An inverse scattering 
problem of this type is treated in detail in Chapter 7. 

Standard literature on these direct and inverse scattering problems are the 
monographs [53, 55, 176] and the survey articles [50, 248]. 


Example 1.7 (Computer tomography) 

The most spectacular application of the Radon transform is in medical imaging. 
For example, consider a fixed plane through a human body. Let p(x, y) denote 
the change of density at the point (x,y), and let LZ be any line in the plane. 
Suppose that we direct a thin beam of X-rays into the body along DL and 
measure how much of the intensity is attenuated by going through the body. 
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Let L be parametrized by (s,5), where s € R and 6 € [0,7). The ray Ds5 has 
the coordinates 

se +iue® EC, wER, 
where we have identified C with R?. The attenuation of the intensity J is 
approximately described by dJ = —ypI du with some constant y. Integration 
along the ray yields 


InI(u) = -1f p (se’ + ite’) dt 
Uo 


or, assuming that p is of compact support, the relative intensity loss is given by 
co 


InI (co) = -¥ / p (se”? + ite’) dt. 


In principle, from the attenuation factors we can compute all line integrals 
co 


(Rp)(s,6) = / p (se’? + tue’) du, s€R, 6€ (0,7). (1.3) 
—Cco 
Rp is called the Radon transform of p. The direct problem is to compute the 
Radon transform Rp when p is given. The inverse problem is to determine the 
density p for a given Radon transform Rp (that is, measurements of all line 
integrals). 
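The problem simplifies in the following special case, where we assume that
$\rho$ is radially symmetric and we choose only vertical rays. Then $\rho = \rho(r)$,
$r = \sqrt{x^2 + y^2}$, and the ray $L_x$ passing through $(x,0)$ can be parametrized by
$(x,u)$, $u \in \mathbb{R}$. This leads to (the factor 2 is due to symmetry)

\[
  V(x) \;:=\; \ln I(\infty) \;=\; -2\gamma \int_0^{\infty} \rho\bigl(\sqrt{x^2 + u^2}\bigr)\, du .
\]

Again, we assume that $\rho$ is of compact support in $\{x : |x| \le R\}$. The change of
variables $u = \sqrt{r^2 - x^2}$ leads to

\[
  V(x) \;=\; -2\gamma \int_x^{\infty} \frac{r}{\sqrt{r^2 - x^2}}\, \rho(r)\, dr
  \;=\; -2\gamma \int_x^{R} \frac{r}{\sqrt{r^2 - x^2}}\, \rho(r)\, dr . \tag{1.4}
\]

A further change of variables $z = R^2 - r^2$ and $y = R^2 - x^2$ transforms this equation
into the following Abel’s integral equation for the function $z \mapsto \rho(\sqrt{R^2 - z})$:

\[
  V\bigl(\sqrt{R^2 - y}\bigr) \;=\; -\gamma \int_0^{y} \frac{\rho\bigl(\sqrt{R^2 - z}\bigr)}{\sqrt{y - z}}\, dz,
  \quad 0 \le y \le R^2 . \tag{1.5}
\]
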
The standard mathematical literature on the Radon transform and its applica- 
tions are the monographs [128, 130, 206]. We refer also to the survey articles 
[131, 183, 185, 192].
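For the radially symmetric special case, the direct problem can be checked numerically. The sketch below is ours (the name `log_intensity_loss`, the midpoint rule and the default $\gamma = 1$ are assumptions); it integrates along a vertical ray up to the edge of the support, where the integrand has no singularity.

```python
import math


def log_intensity_loss(rho, x, R, gamma=1.0, n=2000):
    """Midpoint rule for V(x) = -2*gamma * int_0^U rho(sqrt(x^2 + u^2)) du, U = sqrt(R^2 - x^2)."""
    if abs(x) >= R:
        return 0.0                       # the ray misses the support of rho
    upper = math.sqrt(R * R - x * x)
    du = upper / n
    total = sum(rho(math.hypot(x, (j + 0.5) * du)) for j in range(n))
    return -2.0 * gamma * total * du


# For rho = 1 on the disk of radius R the exact value is -2*gamma*sqrt(R^2 - x^2).
v_const = log_intensity_loss(lambda r: 1.0, x=0.6, R=1.0)
# For rho(r) = 1 - r^2 and R = 1 the exact value is -(4/3)*(1 - x^2)^(3/2).
v_parab = log_intensity_loss(lambda r: 1.0 - r * r, x=0.6, R=1.0)
```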

The following example is due to Abel himself. 
Example 1.8 (Abel’s integral equation) 
Let a mass element move along a curve $\Gamma$ from a point $p_1$ on level $h > 0$ to
a point $p_0$ on level $h = 0$. The only force acting on this mass element is the
gravitational force $mg$.

The direct problem is to determine the time $T$ in which the element moves from
$p_1$ to $p_0$ when the curve $\Gamma$ is given. In the inverse problem, one measures the
time $T = T(h)$ for several values of $h$ and tries to determine the curve $\Gamma$. Let
the curve be parametrized by $x = \psi(y)$. Let $p$ have the coordinates $(\psi(y), y)$.
By conservation of energy, that is,

\[
  E + U \;=\; \frac{m}{2}\, v^2 + m g y \;=\; \text{const} \;=\; m g h ,
\]

we conclude for the velocity that

\[
  \frac{ds}{dt} \;=\; v \;=\; \sqrt{2g(h - y)} .
\]

The total time $T$ from $p_1$ to $p_0$ is

\[
  T \;=\; T(h) \;=\; \int_{p_0}^{p_1} \frac{ds}{v}
  \;=\; \int_0^h \sqrt{\frac{1 + \psi'(y)^2}{2g\,(h - y)}}\; dy \quad \text{for } h > 0 .
\]

Set $\varphi(y) = \sqrt{1 + \psi'(y)^2}$ and let $f(h) := T(h)\sqrt{2g}$ be known (measured). Then
we have to determine the unknown function $\varphi$ from Abel’s integral equation

\[
  \int_0^h \frac{\varphi(y)}{\sqrt{h - y}}\; dy \;=\; f(h) \quad \text{for } h > 0 . \tag{1.6}
\]

A similar, but more important, problem occurs in seismology. One studies the
problem to determine the velocity distribution $c$ of the Earth from measurements
of the travel times of seismic waves (see [29]).

For further examples of inverse problems leading to Abel’s integral equations,
we refer to the lecture notes by R. Gorenflo and S. Vessella [108], the monograph
[198], and the papers [179, 270].
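As a plausibility check of Abel’s integral equation (1.6), consider the simplest curve, a vertical line. This special case is our own illustration and not part of the original text: here $\psi' \equiv 0$, hence $\varphi \equiv 1$, and the result should reproduce the free-fall time.

```latex
\[
  f(h) \;=\; \int_0^h \frac{dy}{\sqrt{h - y}}
       \;=\; \Bigl[-2\sqrt{h - y}\,\Bigr]_{y=0}^{y=h}
       \;=\; 2\sqrt{h},
  \qquad
  T(h) \;=\; \frac{f(h)}{\sqrt{2g}} \;=\; \sqrt{\frac{2h}{g}} .
\]
```

This agrees with the elementary law $h = \tfrac{1}{2} g T^2$ for a body falling from rest.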


Example 1.9 (Backwards heat equation) 
Consider the one-dimensional heat equation 


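\[
  \frac{\partial u(x,t)}{\partial t} \;=\; \frac{\partial^2 u(x,t)}{\partial x^2},
  \quad (x,t) \in (0,\pi) \times \mathbb{R}_{>0}, \tag{1.7a}
\]

with boundary conditions

\[
  u(0,t) \;=\; u(\pi,t) \;=\; 0, \quad t \ge 0, \tag{1.7b}
\]

and initial condition

\[
  u(x,0) \;=\; u_0(x), \quad 0 \le x \le \pi . \tag{1.7c}
\]

The separation of variables leads to the (formal) solution

\[
  u(x,t) \;=\; \sum_{n=1}^{\infty} a_n e^{-n^2 t} \sin(nx)
  \quad \text{with} \quad
  a_n \;=\; \frac{2}{\pi} \int_0^{\pi} u_0(y) \sin(ny)\, dy . \tag{1.8}
\]
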

The direct problem is to solve the classical initial boundary value problem: Given
the initial temperature distribution $u_0$ and the final time $T$, determine $u(\cdot,T)$.
In the inverse problem, one measures the final temperature distribution $u(\cdot,T)$
and tries to determine the temperature at earlier times $t < T$, for example, the
initial temperature $u(\cdot,0)$.
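The solution formula shows why the backward problem is delicate: the $n$-th sine coefficient is damped by $e^{-n^2 T}$ in the forward direction, so a formal inversion multiplies it, and any error in it, by $e^{n^2 T}$. The following sketch illustrates this with made-up coefficients and a made-up perturbation of size $10^{-8}$; the function names are hypothetical.

```python
import math


def forward_coefficients(a0, T):
    """Sine coefficients of u(., T) from those of u0: a_n * exp(-n^2 T), n = 1, 2, ..."""
    return [a * math.exp(-(n ** 2) * T) for n, a in enumerate(a0, start=1)]


def backward_coefficients(aT, T):
    """Formal inversion: multiply the n-th coefficient of u(., T) by exp(+n^2 T)."""
    return [a * math.exp((n ** 2) * T) for n, a in enumerate(aT, start=1)]


T = 1.0
a0 = [1.0, 0.5, 0.25, 0.125, 0.0625]       # illustrative initial coefficients
aT = forward_coefficients(a0, T)
noisy = [a + 1e-8 for a in aT]             # perturb every coefficient by 1e-8
recovered = backward_coefficients(noisy, T)
errors = [abs(r - a) for r, a in zip(recovered, a0)]
```

With these numbers the error in the first coefficient stays of order $10^{-8}$, while the error in the fifth is amplified by $e^{25}$ and becomes far larger than the coefficient itself.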

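From the solution formula (1.8), we see that we have to determine $u_0 :=
u(\cdot,0)$ from the following integral equation:

\[
  u(x,T) \;=\; \frac{2}{\pi} \int_0^{\pi} k(x,y)\, u_0(y)\, dy, \quad 0 \le x \le \pi, \tag{1.9}
\]

where

\[
  k(x,y) \;:=\; \sum_{n=1}^{\infty} e^{-n^2 T} \sin(nx) \sin(ny) . \tag{1.10}
\]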


We refer to the monographs [17, 198] and papers [31, 43, 49, 80, 81, 94,
193, 247] for further reading on this subject.


Example 1.10 (Diffusion in an inhomogeneous medium)
The equation of diffusion in an inhomogeneous medium (now in two dimensions) 
is described by the equation 


\[
  \frac{\partial u(x,t)}{\partial t} \;=\; \frac{1}{c}\, \mathrm{div}\bigl(\gamma(x) \nabla u(x,t)\bigr),
  \quad x \in D,\ t > 0, \tag{1.11}
\]

where $c$ is a constant and $\gamma = \gamma(x)$ is a parameter describing the medium. In
the stationary case, this reduces to

\[
  \mathrm{div}(\gamma \nabla u) \;=\; 0 \quad \text{in } D . \tag{1.12}
\]

The direct problem is to solve the boundary value problem for this equation for
given boundary values $u|_{\partial D}$ and given function $\gamma$. In the inverse problem, one
measures $u$ and the flux $\gamma\, \partial u/\partial \nu$ on the boundary $\partial D$ and tries to determine
the unknown function $\gamma$ in $D$. This is the problem of impedance tomography
which we consider in more detail in Chapter 6.

The problem of impedance tomography is an example of a parameter identi- 
fication problem for a partial differential equation. Among the extensive litera- 
ture on parameter identification problems, we only mention the classical papers 
[166, 225, 224], the monographs [15, 17, 198], and the survey article [200]. 


Example 1.11 (Sturm–Liouville eigenvalue problem)

Let a string of length $L$ and mass density $\rho = \rho(x) > 0$, $0 \le x \le L$, be fixed
at the endpoints $x = 0$ and $x = L$. Plucking the string produces tones due to
vibrations. Let $v(x,t)$, $0 \le x \le L$, $t > 0$, be the displacement at $x$ and time $t$.
It satisfies the wave equation

\[
  \rho(x)\, \frac{\partial^2 v(x,t)}{\partial t^2} \;=\; \frac{\partial^2 v(x,t)}{\partial x^2},
  \quad 0 < x < L,\ t > 0, \tag{1.13}
\]

subject to boundary conditions $v(0,t) = v(L,t) = 0$ for $t > 0$.
A periodic displacement of the form

\[
  v(x,t) \;=\; w(x)\, \bigl[a \cos \omega t + b \sin \omega t\bigr]
\]

with frequency $\omega > 0$ is called a pure tone. This form of $v$ solves the boundary
value problem (1.13) if and only if $w$ and $\omega$ satisfy the Sturm–Liouville
eigenvalue problem

\[
  w''(x) + \omega^2 \rho(x)\, w(x) \;=\; 0, \quad 0 < x < L, \qquad w(0) = w(L) = 0 . \tag{1.14}
\]

The direct problem is to compute the eigenfrequencies $\omega$ and the corresponding
eigenfunctions for known function $\rho$. In the inverse problem, one tries to
determine the mass density $\rho$ from a number of measured frequencies $\omega$.

We see in Chapter 5 that parameter estimation problems for parabolic and 
hyperbolic initial boundary value problems are closely related to inverse spectral 
problems. 

Example 1.12 (Inverse Stefan problem)

The physicist Stefan (see [253]) modeled the melting of arctic ice in the summer
by a simple one-dimensional model. In particular, consider a homogeneous block
of ice filling the region $x \ge \ell$ at time $t = 0$. The ice starts to melt by heating
the block at the left end. Thus, at time $t > 0$ the region between $x = 0$ and
$x = s(t)$ for some $s(t) > 0$ is filled with water, and the region $x \ge s(t)$ is filled
with ice.

Let $u(x,t)$ be the temperature at $0 < x < s(t)$ and time $t$. Then $u$ satisfies the
one-dimensional heat equation

$$\frac{\partial u(x,t)}{\partial t} = \frac{\partial^2 u(x,t)}{\partial x^2} \quad\text{in } D := \{(x,t) \in \mathbb{R}^2 : 0 < x < s(t),\ t > 0\} \tag{1.15}$$

subject to boundary conditions $\frac{\partial}{\partial x} u(0,t) = f(t)$ and $u(s(t),t) = 0$ for $t \in [0,T]$
and initial condition $u(x,0) = u_0(x)$ for $0 \le x \le \ell$.

Here, $u_0$ describes the initial temperature and $f(t)$ the heat flux at the left
boundary $x = 0$. The speed at which the interface between water and ice
moves is proportional to the heat flux. This is described by the following Stefan
condition:


$$\frac{ds(t)}{dt} = -\frac{\partial u(s(t),t)}{\partial x} \quad\text{for } t \in [0,T]. \tag{1.16}$$

The direct problem is to compute the curve $s$ when the boundary data $f$ and $u_0$
are given. In the inverse problem, one has given a desired curve $s$ and tries to
reconstruct $u$ and $f$ (or $u_0$).
We refer to the monographs [39, 198] and the classical papers [40, 95] for a 
detailed introduction to Stefan problems. 


In all of these examples, we can formulate the direct problem as the evaluation
of an operator $K$ acting on a known “model” $x$ in a model space $X$ and the
inverse problem as the solution of the equation $K(x) = y$:


Direct problem: given x (and K), evaluate K(x). 
Inverse problem: given y (and K), solve K(x) = y for x. 


In order to formulate an inverse problem, the definition of the operator K, 
including its domain and range, has to be given. The formulation as an oper- 
ator equation allows us to distinguish among finite, semifinite, and infinite- 
dimensional, linear and nonlinear problems. 
In general, the evaluation of K(x) means solving a boundary value problem
for a differential equation or evaluating an integral. 

For more general and “philosophical” aspects of inverse theory, we refer to
[7, 214].


1.2 Ill-Posed Problems 


For all of the pairs of problems presented in the last section, there is a funda- 
mental difference between the direct and the inverse problems. In all cases, the 
inverse problem is ill-posed or improperly posed in the sense of Hadamard, while 
the direct problem is well-posed. In his lectures published in [115], Hadamard 
claims that a mathematical model for a physical problem (he was thinking in 
terms of a boundary value problem for a partial differential equation) has to be 
properly posed or well-posed in the sense that it has the following three proper- 
ties: 


1. There exists a solution of the problem (existence). 
2. There is at most one solution of the problem (uniqueness). 
3. The solution depends continuously on the data (stability). 


Mathematically, the existence of a solution can be enforced by enlarging the 
solution space. The concept of distributional solutions of a differential equation 
is an example. If a problem has more than one solution, then information about 
the model is missing. In this case, additional properties, such as sign conditions, 
can be built into the model. The requirement of stability is the most important 
one. If a problem lacks the property of stability, then its solution is practically 
impossible to compute because any measurement or numerical computation is 
polluted by unavoidable errors: thus the data of a problem are always perturbed 
by noise! If the solution of a problem does not depend continuously on the data, 
then in general the computed solution has nothing to do with the true solution. 
Indeed, there is no way to overcome this difficulty unless additional information 
about the solution is available. Here, we remind the reader of the following 
statement (see Lanczos [171]): 


A lack of information cannot be remedied by any mathematical trickery!

Mathematically, we formulate the notion of well-posedness in the following
way.


Definition 1.13 (Well-posedness) 
Let $X$ and $Y$ be normed spaces, and $K : X \to Y$ a linear operator. The equation
$Kx = y$ is called properly posed or well-posed if the following holds:


1. Existence: For every $y \in Y$, there is (at least one) $x \in X$ such that
$Kx = y$.


2. Uniqueness: For every $y \in Y$, there is at most one $x \in X$ with $Kx = y$.


3. Stability: The solution $x$ depends continuously on $y$; that is, for every
sequence $(x_n)$ in $X$ with $Kx_n \to Kx$ $(n \to \infty)$, it follows that $x_n \to x$
$(n \to \infty)$.


Equations for which (at least) one of these properties does not hold are called 
improperly posed or ill-posed. 


In Chapter 4, we will extend this definition to local ill-posedness of nonlinear 
problems. 

It is important to specify the full triple $(X, Y, K)$ and their norms. Existence
and uniqueness depend only on the algebraic nature of the spaces and the
operator, that is, whether the operator is onto or one-to-one. Stability, however,
depends also on the topologies of the spaces, that is, whether the inverse
operator $K^{-1} : Y \to X$ is continuous.

These requirements are not independent of each other. For example, due to
the open mapping theorem (see Theorem A.27 of Appendix A.3), the inverse
operator $K^{-1}$ is automatically continuous if $K$ is linear and continuous and $X$
and $Y$ are Banach spaces.

As an example for an ill-posed problem, we study the classical example given 
by Hadamard in his famous paper [115]. 


Example 1.14 (Cauchy’s problem for the Laplace equation) 
Find a solution u of the Laplace equation 


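$$\Delta u(x,y) := \frac{\partial^2 u(x,y)}{\partial x^2} + \frac{\partial^2 u(x,y)}{\partial y^2} = 0 \quad\text{in } \mathbb{R} \times [0,\infty) \tag{1.17a}$$

that satisfies the “initial conditions”

$$u(x,0) = f(x), \quad \frac{\partial}{\partial y} u(x,0) = g(x), \quad x \in \mathbb{R}, \tag{1.17b}$$

where $f$ and $g$ are given functions. Obviously, the (unique) solution for $f(x) = 0$
and $g(x) = \frac{1}{n} \sin(nx)$ is given by

$$u(x,y) = \frac{1}{n^2} \sin(nx) \sinh(ny), \quad x \in \mathbb{R},\ y \ge 0.$$

Therefore, we have

$$\sup_{x \in \mathbb{R}} \bigl\{ |f(x)| + |g(x)| \bigr\} = \frac{1}{n} \longrightarrow 0, \quad n \to \infty,$$

but

$$\sup_{x \in \mathbb{R}} |u(x,y)| = \frac{1}{n^2} \sinh(ny) \longrightarrow \infty, \quad n \to \infty,$$

for all $y > 0$. The error in the data tends to zero while the error in the solution
$u$ tends to infinity! Therefore, the solution does not depend continuously on the
data, and the problem is improperly posed.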
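
The two suprema in this example can be tabulated directly. The sketch below only evaluates the closed-form expressions $1/n$ and $\sinh(ny)/n^2$ from the text; the function names and the height $y = 0.5$ are assumptions made for illustration.

```python
import numpy as np

def data_sup_norm(n):
    """sup_x (|f(x)| + |g(x)|) for f = 0 and g(x) = sin(n x) / n."""
    return 1.0 / n

def solution_sup_norm(n, y):
    """sup_x |u(x, y)| for u(x, y) = sin(n x) sinh(n y) / n^2."""
    return np.sinh(n * y) / n**2

y = 0.5  # any fixed height y > 0 shows the same behaviour
for n in (1, 10, 50, 100):
    print(f"n = {n:3d}: data {data_sup_norm(n):.2e}, solution {solution_sup_norm(n, y):.2e}")
```

The data norm decays like $1/n$ while the solution norm grows exponentially in $n$ at every fixed $y > 0$.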


Many inverse problems and some of the examples of the last section (for
further examples, we refer to [111]) lead to integral equations of the first kind 
with continuous or weakly singular kernels. Such integral operators are compact 
with respect to any reasonable topology. The following example will often serve 
as a model case in these lectures. 


Example 1.15 (Differentiation) 
The direct problem is to find the antiderivative y with y(0) = 0 of a given 
continuous function x on [0,1], that is, compute 


$$y(t) = \int_0^t x(s)\,ds, \quad t \in [0,1]. \tag{1.18}$$


In the inverse problem, we are given a continuously differentiable function $y$ on
$[0,1]$ with $y(0) = 0$ and want to determine $x = y'$. This means we have to solve
the integral equation $Kx = y$, where $K : C[0,1] \to C[0,1]$ is defined by


$$(Kx)(t) := \int_0^t x(s)\,ds, \quad t \in [0,1], \quad\text{for } x \in C[0,1]. \tag{1.19}$$

Here, we equip $C[0,1]$ with the supremum norm $\|x\|_\infty := \max_{0 \le t \le 1} |x(t)|$. The
solution of $Kx = y$ is just the derivative $x = y'$, provided $y(0) = 0$ and $y$ is
continuously differentiable! If $x$ is the exact solution of $Kx = y$, and if we
perturb $y$ in the norm $\|\cdot\|_\infty$, then the perturbed right-hand side $\tilde{y}$ doesn't
have to be differentiable, and even if it is, the solution of the perturbed problem
is not necessarily close to the exact solution. We can, for example, perturb $y$
by $\delta \sin(t/\delta^2)$ for small $\delta$. Then the error of the data (with respect to $\|\cdot\|_\infty$)
is $\delta$ and the error in the solution is $1/\delta$. The problem $(K, C[0,1], C[0,1])$ is
therefore ill-posed.

Now we choose a different space $Y := \{y \in C^1[0,1] : y(0) = 0\}$ for the
right-hand side and equip $Y$ with the stronger norm $\|x\|_{C^1} := \max_{0 \le t \le 1} |x'(t)|$. If the
right-hand side is perturbed with respect to this norm $\|\cdot\|_{C^1}$, then the problem
$(K, C[0,1], Y)$ is well-posed because $K : C[0,1] \to Y$ is boundedly invertible.
This example again illustrates the fact that well-posedness depends on the
topology.
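
The perturbation $\delta \sin(t/\delta^2)$ from this example can be checked numerically. The following sketch, with a hypothetical value $\delta = 10^{-2}$ and a sampling grid chosen only for illustration, compares its supremum norm with the supremum norm of its derivative $\cos(t/\delta^2)/\delta$.

```python
import numpy as np

delta = 1e-2                                   # hypothetical noise level
t = np.linspace(0.0, 1.0, 200001)              # fine grid to resolve the oscillation

# Perturbation of the data y and the induced perturbation of x = y'.
data_perturbation = delta * np.sin(t / delta**2)
solution_perturbation = np.cos(t / delta**2) / delta   # exact derivative of the above

data_error = np.abs(data_perturbation).max()
solution_error = np.abs(solution_perturbation).max()
print(f"sup-norm data error     = {data_error:.4e}")
print(f"sup-norm solution error = {solution_error:.4e}")
```

Measured in the $C^1$-norm, the same data perturbation has size $1/\delta$, which is why the problem becomes well-posed once $Y$ carries the stronger norm.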

In the numerical treatment of integral equations, a discretization error can- 
not be avoided. For integral equations of the first kind, a “naive” discretization 
usually leads to disastrous results as the following simple example shows (see 
also [267]). 


Example 1.16 
The integral equation 


$$\int_0^1 e^{ts} x(s)\,ds = y(t), \quad 0 \le t \le 1, \tag{1.20}$$

with $y(t) = (\exp(t+1) - 1)/(t+1)$, is uniquely solvable by $x(t) = \exp(t)$. We
approximate the integral by the trapezoidal rule


$$\int_0^1 e^{ts} x(s)\,ds \approx h \left( \frac{1}{2} x(0) + \frac{1}{2} e^{t} x(1) + \sum_{j=1}^{n-1} e^{jht} x(jh) \right)$$


with $h := 1/n$. For $t = ih$, we obtain the linear system


$$h \left( \frac{1}{2} x_0 + \frac{1}{2} e^{ih} x_n + \sum_{j=1}^{n-1} e^{jih^2} x_j \right) = y(ih), \quad i = 0, \ldots, n. \tag{1.21}$$


Then $x_i$ should be an approximation to $x(ih)$. The following table lists the
error between the exact solution $x(t)$ and the approximate solution $x_i$ for $t = 0$,
0.25, 0.5, 0.75, and 1. Here, $i$ is chosen such that $ih = t$.


  t      n = 4    n = 8    n = 16   n = 32
  0       0.44     0.47     1.30    41.79
  0.25    0.67     2.03    39.02    78.39
  0.5     0.95     4.74    15.34     1.72
  0.75    1.02     3.08    15.78     2.01
  1       1.09     1.23     0.91    20.95


We see that the approximations have nothing to do with the true solution 
and become even worse for finer discretization schemes. 
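
The system (1.21) is easy to assemble, which makes the effect reproducible. The sketch below is one possible implementation under stated assumptions: the helper name `trapezoid_system` is hypothetical, the system is solved with a standard dense solver, and the exact error values depend on the solver and floating-point arithmetic, so they need not coincide with the table above.

```python
import numpy as np

def trapezoid_system(n):
    """Assemble the (n+1) x (n+1) system (1.21) for the kernel exp(t s)."""
    h = 1.0 / n
    nodes = h * np.arange(n + 1)
    weights = np.full(n + 1, h)
    weights[0] = weights[-1] = h / 2                  # trapezoidal end weights
    A = np.exp(np.outer(nodes, nodes)) * weights      # A[i, j] = w_j exp(t_i s_j)
    y = (np.exp(nodes + 1.0) - 1.0) / (nodes + 1.0)   # right-hand side of (1.20)
    return A, y, nodes

for n in (4, 8, 16, 32):
    A, y, nodes = trapezoid_system(n)
    x_h = np.linalg.solve(A, y)
    err = np.abs(x_h - np.exp(nodes))
    print(f"n = {n:2d}: cond(A) = {np.linalg.cond(A):.2e}, max error = {err.max():.2e}")
```

The rapidly growing condition number of the matrix explains why the small quadrature error of the trapezoidal rule is amplified into an error of order one or larger.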


In the previous two examples, the problem was to solve integral equations 
of the first kind. Integral operators are compact operators in many natural 
topologies under very weak conditions on the kernels. The next theorem implies 
that linear equations of the form $Kx = y$ with compact operators $K$ are always
ill-posed. 


Theorem 1.17 Let $X$, $Y$ be normed spaces and $K : X \to Y$ be a linear compact
operator with nullspace $N(K) := \{x \in X : Kx = 0\}$. Let the dimension
of the factor space $X/N(K)$ be infinite. Then there exists a sequence $(x_n)$
in $X$ such that $Kx_n \to 0$ but $(x_n)$ does not converge. We can even choose
$(x_n)$ such that $\|x_n\|_X \to \infty$. In particular, if $K$ is one-to-one, the inverse
$K^{-1} : Y \supset R(K) \to X$ is unbounded. Here, $R(K) := \{Kx \in Y : x \in X\}$
denotes the range of $K$.


Proof: We set $N = N(K)$ for abbreviation. The factor space $X/N$ is a
normed space with norm $\|[x]\| := \inf\{\|x + z\|_X : z \in N\}$ since the nullspace
is closed. The induced operator $\tilde{K} : X/N \to Y$, defined by $\tilde{K}([x]) := Kx$,
$[x] \in X/N$, is well-defined, compact, and one-to-one. The inverse $\tilde{K}^{-1} : Y \supset
R(K) \to X/N$ is unbounded since otherwise the identity $I = \tilde{K}^{-1}\tilde{K} : X/N \to
X/N$ would be compact as a composition of a bounded and a compact operator
(see Theorem A.34). This would contradict the assumption that the dimension
of $X/N$ is infinite (see again Theorem A.34). Because $\tilde{K}^{-1}$ is unbounded, there
exists a sequence $([z_n])$ in $X/N$ with $Kz_n \to 0$ and $\|[z_n]\| = 1$. We choose
$v_n \in N$ such that $\|z_n + v_n\|_X \ge \frac{1}{2}$ and set $x_n := (z_n + v_n)/\sqrt{\|Kz_n\|_Y}$. Then
$Kx_n \to 0$ and $\|x_n\|_X \to \infty$.


1.3 The Worst-Case Error


We come back to Example 1.15 of the previous section: Determine $x \in C[0,1]$
such that $\int_0^t x(s)\,ds = y(t)$ for all $t \in [0,1]$. An obvious question is: How large
could the error be in the worst case if the error in the right side $y$ is at most
$\delta$? The answer is already given by Theorem 1.17: If the errors are measured in
norms such that the integral operator is compact, then the solution error could 
be arbitrarily large. For the special Example 1.15, we have constructed explicit 
perturbations with this property. 

However, the situation is different if additional information is available. 
Before we study the general case, we illustrate this observation for a model 
example. 

Let $y$ and $\tilde{y}$ be twice continuously differentiable and let a number $E > 0$ be
available with

$$\|y''\|_\infty \le E \quad\text{and}\quad \|\tilde{y}''\|_\infty \le E. \tag{1.22}$$

Set $z := \tilde{y} - y$ and assume that $z'(0) = z(0) = 0$ and $z'(t) \ge 0$ for $t \in [0,1]$.
Then we estimate the error $\tilde{x} - x$ in the solution of Example 1.15 by

$$|\tilde{x}(t) - x(t)|^2 = |z'(t)|^2 = \int_0^t \frac{d}{ds} |z'(s)|^2\,ds = 2 \int_0^t z''(s)\,z'(s)\,ds \le 4E \int_0^t z'(s)\,ds = 4E\,z(t).$$

Therefore, under the above assumptions on $z = \tilde{y} - y$ we have shown that
$\|\tilde{x} - x\|_\infty \le 2\sqrt{E\delta}$ if $\|\tilde{y} - y\|_\infty \le \delta$ and $E$ is a bound as in (1.22). In this
example, $2\sqrt{E\delta}$ is a bound on the worst-case error for an error $\delta$ in the data
and the additional information $\|x'\|_\infty = \|y''\|_\infty \le E$ on the solution.
We define the following quite generally.


Definition 1.18 Let $K : X \to Y$ be a linear bounded operator between Banach
spaces, $\hat{X} \subset X$ a subspace, and $\|\cdot\|_{\hat{X}}$ a “stronger” norm on $\hat{X}$; that is, there
exists $c > 0$ such that $\|x\|_X \le c\,\|x\|_{\hat{X}}$ for all $x \in \hat{X}$. Then we define


$$F\bigl(\delta, E, \|\cdot\|_{\hat{X}}\bigr) := \sup\bigl\{ \|x\|_X : x \in \hat{X},\ \|Kx\|_Y \le \delta,\ \|x\|_{\hat{X}} \le E \bigr\}, \tag{1.23}$$


and call $F(\delta, E, \|\cdot\|_{\hat{X}})$ the worst-case error for the error $\delta$ in the data and a
priori information $\|x\|_{\hat{X}} \le E$.

$F(\delta, E, \|\cdot\|_{\hat{X}})$ depends on the operator $K$ and the norms in $X$, $Y$, and $\hat{X}$.
It is desirable that this worst-case error not only converges to zero as $\delta$ tends
to zero but that it is of order $\delta$. This is certainly true (even without a priori
information) for boundedly invertible operators, as is readily seen from the
inequality $\|x\|_X \le \|K^{-1}\|_{\mathcal{L}(Y,X)} \|Kx\|_Y$. For compact operators $K$, however,
and norm $\|\cdot\|_{\hat{X}} = \|\cdot\|_X$, this worst-case error does not converge to zero (see the
following lemma), and one is forced to take a stronger norm $\|\cdot\|_{\hat{X}}$.


Lemma 1.19 Let $K : X \to Y$ be linear and compact and assume that $X/N(K)$
is infinite-dimensional. Then for every $E > 0$, there exists $c > 0$ and $\delta_0 > 0$
such that $F(\delta, E, \|\cdot\|_X) \ge c$ for all $\delta \in (0, \delta_0)$.


Proof: Assume that there exists a sequence $\delta_n \to 0$ such that
$F(\delta_n, E, \|\cdot\|_X) \to 0$ as $n \to \infty$. Let $\tilde{K} : X/N(K) \to Y$ be again the induced operator in
the factor space. We show that $\tilde{K}^{-1}$ is bounded: Let $\tilde{K}([x_m]) = Kx_m \to 0$.
Then there exists a subsequence $(x_{m_n})$ with $\|Kx_{m_n}\|_Y \le \delta_n$ for all $n$. We set


$$z_n := \begin{cases} x_{m_n}, & \text{if } \|x_{m_n}\|_X \le E, \\ E\,\|x_{m_n}\|_X^{-1}\,x_{m_n}, & \text{if } \|x_{m_n}\|_X > E. \end{cases}$$


Then $\|z_n\|_X \le E$ and $\|Kz_n\|_Y \le \delta_n$ for all $n$. Because the worst-case error tends
to zero, we also conclude that $\|z_n\|_X \to 0$. From this, we see that $z_n = x_{m_n}$
for sufficiently large $n$; that is, $x_{m_n} \to 0$ as $n \to \infty$. This argument, applied to
every subsequence of the original sequence $(x_m)$, yields that $x_m$ tends to zero
for $m \to \infty$; that is, $\tilde{K}^{-1}$ is bounded on the range $R(K)$ of $K$. This, however,
contradicts the assertion of Theorem 1.17.


In the following analysis, we make use of the singular value decomposition
of the operator $K$ (see Appendix A.6, Definition A.56). Therefore, we assume
from now on that $X$ and $Y$ are Hilbert spaces. In many applications $X$ and
$Y$ are Sobolev spaces; that is, spaces of measurable functions such that their
(generalized) derivatives are square integrable. Sobolev spaces of functions of
one variable can be characterized as follows:


$$H^p(a,b) := \left\{ x \in C^{p-1}[a,b] : x^{(p-1)}(t) = \alpha + \int_a^t \psi(s)\,ds,\ \alpha \in \mathbb{R},\ \psi \in L^2 \right\} \tag{1.24}$$

for $p \in \mathbb{N}$.


Example 1.20 (Differentiation) 
As an example, we study differentiation and set $X = Y = L^2(0,1)$,


$$(Kx)(t) = \int_0^t x(s)\,ds, \quad t \in (0,1),\ x \in L^2(0,1),$$

and

$$X_1 := \{x \in H^1(0,1) : x(1) = 0\}, \tag{1.25a}$$
$$X_2 := \{x \in H^2(0,1) : x(1) = 0,\ x'(0) = 0\}. \tag{1.25b}$$

We define $\|x\|_1 := \|x'\|_{L^2}$ for $x \in X_1$, and $\|x\|_2 := \|x''\|_{L^2}$ for $x \in X_2$. Then
the norms $\|\cdot\|_j$, $j = 1, 2$, are stronger than $\|\cdot\|_{L^2}$ (see Problem 1.2), and we
can prove for every $E > 0$ and $\delta > 0$:

$$F(\delta, E, \|\cdot\|_1) \le \sqrt{\delta E} \quad\text{and}\quad F(\delta, E, \|\cdot\|_2) \le \delta^{2/3} E^{1/3}. \tag{1.26}$$


From this result, we observe that the possibility to reconstruct $x$ is dependent on
the smoothness of the solution. We come back to this remark in a more general
setting (Theorem 1.21). We will also see that these estimates are asymptotically
sharp; that is, the exponent of $\delta$ cannot be increased.


Proof of (1.26): First, assume that $x \in H^1(0,1)$ with $x(1) = 0$. Partial
integration, which is easily seen to be allowed for $H^1$-functions, and the
Cauchy–Schwarz inequality yield


$$\begin{aligned}
\|x\|_{L^2}^2 &= \int_0^1 x(t)\,x(t)\,dt \\
&= -\int_0^1 x'(t) \left[ \int_0^t x(s)\,ds \right] dt + \left[ x(t) \int_0^t x(s)\,ds \right]_{t=0}^{t=1} \\
&= -\int_0^1 x'(t)\,(Kx)(t)\,dt \le \|Kx\|_{L^2}\,\|x'\|_{L^2}.
\end{aligned} \tag{1.27}$$


This yields the first estimate. Now let $x \in H^2(0,1)$ such that $x(1) = 0$ and
$x'(0) = 0$. Using partial integration again, we estimate


$$\|x'\|_{L^2}^2 = \int_0^1 x'(t)\,x'(t)\,dt = -\int_0^1 x(t)\,x''(t)\,dt \le \|x\|_{L^2}\,\|x''\|_{L^2}.$$


Now we substitute this into the right-hand side of (1.27):


$$\|x\|_{L^2}^2 \le \|Kx\|_{L^2}\,\|x'\|_{L^2} \le \|Kx\|_{L^2}\,\sqrt{\|x\|_{L^2}}\,\sqrt{\|x''\|_{L^2}}.$$


From this, the second estimate of (1.26) follows.

This example is typical in the sense that integral operators are often smoothing.
We can define an abstract “smoothness” of an element $x \in X$ with respect
to a compact operator $K : X \to Y$ by requiring that $x \in R(K^*)$ or $x \in R(K^*K)$
or, more generally, $x \in R\bigl((K^*K)^{\sigma/2}\bigr)$ for some real $\sigma > 0$. The operator
$(K^*K)^{\sigma/2}$ from $X$ into itself is defined as


$$(K^*K)^{\sigma/2} x = \sum_{j \in J} \mu_j^{\sigma}\,(x, x_j)_X\,x_j, \quad x \in X,$$


where $\{\mu_j, x_j, y_j : j \in J\}$ is a singular system for $K$ (see Appendix A.6,
Theorem A.57 and formula (A.47)). We note that $R(K^*) = R\bigl((K^*K)^{1/2}\bigr)$.
Picard's Theorem (Theorem A.58) yields that $x \in R\bigl((K^*K)^{\sigma/2}\bigr)$ is equivalent
to $\sum_{j \in J} |(x, x_j)_X|^2 / \mu_j^{2\sigma} < \infty$, which is indeed a smoothness assumption in concrete
applications. We refer to Example A.59 and the definition of Sobolev spaces of
periodic functions (Section A.4) where smoothness is expressed by a decay of
the Fourier coefficients.


Theorem 1.21 Let $X$ and $Y$ be Hilbert spaces, and $K : X \to Y$ linear, compact,
and one-to-one with dense range $R(K)$. Let $K^* : Y \to X$ be the adjoint
operator.


(a) Set $X_1 := R(K^*)$ and $\|x\|_1 := \|(K^*)^{-1} x\|_Y$ for $x \in X_1$. Then

$$F(\delta, E, \|\cdot\|_1) \le \sqrt{\delta E}.$$

Furthermore, for every $E > 0$ there exists a sequence $\delta_j \to 0$ such that
$F(\delta_j, E, \|\cdot\|_1) = \sqrt{\delta_j E}$; that is, this estimate is asymptotically sharp.


(b) Set $X_2 := R(K^*K)$ and $\|x\|_2 := \|(K^*K)^{-1} x\|_X$ for $x \in X_2$. Then

$$F(\delta, E, \|\cdot\|_2) \le \delta^{2/3} E^{1/3},$$

and for every $E > 0$ there exists a sequence $\delta_j \to 0$ such that
$F(\delta_j, E, \|\cdot\|_2) = \delta_j^{2/3} E^{1/3}$.


(c) More generally, for some $\sigma > 0$ define $X_\sigma := R\bigl((K^*K)^{\sigma/2}\bigr)$ and
$\|x\|_\sigma := \|(K^*K)^{-\sigma/2} x\|_X$ for $x \in X_\sigma$. Then

$$F(\delta, E, \|\cdot\|_\sigma) \le \delta^{\sigma/(\sigma+1)} E^{1/(\sigma+1)},$$

and for every $E > 0$ there exists a sequence $\delta_j \to 0$ such that
$F(\delta_j, E, \|\cdot\|_\sigma) = \delta_j^{\sigma/(\sigma+1)} E^{1/(\sigma+1)}$.


The norms $\|\cdot\|_1$, $\|\cdot\|_2$, and $\|\cdot\|_\sigma$ are well-defined because $K^*$ and $(K^*K)^{\sigma/2}$
are one-to-one. In concrete examples, the assumptions $x \in R(K^*)$ and
$x \in R\bigl((K^*K)^{\sigma/2}\bigr)$ are smoothness assumptions on the exact solution $x$ (together
with boundary conditions) as we have mentioned already before. In the preceding
example, where $(Kx)(t) = \int_0^t x(s)\,ds$, the spaces $R(K^*)$ and $R(K^*K)$
coincide with the Sobolev spaces $X_1$ and $X_2$ defined in (1.25a) and (1.25b) (see
Problem 1.3).

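Proof of Theorem 1.21: (a) Let $x = K^*z \in X_1$ with $\|Kx\|_Y \le \delta$ and $\|x\|_1 \le E$;
that is, $\|z\|_Y \le E$. Then


$$\|x\|_X^2 = (K^*z, x)_X = (z, Kx)_Y \le \|z\|_Y\,\|Kx\|_Y \le E\delta.$$


This proves the first estimate. Now let $\{\mu_j, x_j, y_j : j \in J\}$ be a singular system
for $K$ (see Appendix A.6, Theorem A.57). Set $\hat{x}_j = E K^* y_j = \mu_j E\,x_j$ and
$\delta_j := \mu_j^2 E \to 0$. Then $\|\hat{x}_j\|_1 = E$, $\|K\hat{x}_j\|_Y = \delta_j$, and $\|\hat{x}_j\|_X = \mu_j E = \sqrt{\delta_j E}$.
This proves part (a). Part (b) is proven similarly or as a special case ($\sigma = 2$)
of part (c).
(c) With a singular system $\{\mu_j, x_j, y_j : j \in J\}$, we have $\|x\|_X^2 = \sum_{j \in J} |\rho_j|^2$
where $\rho_j = (x, x_j)_X$ are the expansion coefficients of $x$. In the following estimate,
we use Hölder's inequality with $p = (\sigma+1)/\sigma$ and $q = \sigma + 1$ (note that
$1/p + 1/q = 1$):


$$\begin{aligned}
\|x\|_X^2 &= \sum_{j \in J} |\rho_j|^2 = \sum_{j \in J} \bigl(|\rho_j|\,\mu_j\bigr)^{2\sigma/(\sigma+1)} \bigl(|\rho_j| / \mu_j^{\sigma}\bigr)^{2/(\sigma+1)} \\
&\le \Biggl[ \sum_{j \in J} \bigl(|\rho_j|\,\mu_j\bigr)^{2p\sigma/(\sigma+1)} \Biggr]^{1/p} \Biggl[ \sum_{j \in J} \bigl(|\rho_j| / \mu_j^{\sigma}\bigr)^{2q/(\sigma+1)} \Biggr]^{1/q} \\
&= \Biggl[ \sum_{j \in J} |\rho_j|^2 \mu_j^2 \Biggr]^{\sigma/(\sigma+1)} \Biggl[ \sum_{j \in J} |\rho_j|^2 \mu_j^{-2\sigma} \Biggr]^{1/(\sigma+1)} \\
&= \|Kx\|_Y^{2\sigma/(\sigma+1)}\,\bigl\|(K^*K)^{-\sigma/2} x\bigr\|_X^{2/(\sigma+1)}.
\end{aligned}$$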


This ends the proof. 
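
The interpolation inequality derived in part (c) can be tested numerically in a finite-dimensional model in which $K$ acts diagonally on expansion coefficients. The sketch below is an illustration under stated assumptions: the singular values $\mu_j = 1/j$, the choice $\sigma = 2$, and the random coefficients $\rho_j$ are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 1.0 / np.arange(1, 51)        # hypothetical singular values mu_j = 1/j
sigma = 2.0

def interpolation_gap(rho, mu, sigma):
    """Return (lhs, rhs) of ||x||^2 <= ||Kx||^(2s/(s+1)) * ||x||_sigma^(2/(s+1))."""
    lhs = np.sum(rho**2)                                   # ||x||_X^2
    norm_Kx_sq = np.sum((mu * rho) ** 2)                   # ||Kx||_Y^2
    norm_sigma_sq = np.sum((rho / mu**sigma) ** 2)         # ||(K*K)^(-sigma/2) x||_X^2
    rhs = norm_Kx_sq ** (sigma / (sigma + 1)) * norm_sigma_sq ** (1 / (sigma + 1))
    return lhs, rhs

pairs = [interpolation_gap(rng.standard_normal(50), mu, sigma) for _ in range(1000)]
print(all(lhs <= rhs * (1 + 1e-12) for lhs, rhs in pairs))

# For a single singular vector Hoelder's inequality holds with equality.
e5 = np.zeros(50)
e5[4] = 1.0
print(interpolation_gap(e5, mu, sigma))
```

Equality for a single singular vector is the mechanism behind the sharpness statements of the theorem, where the elements $\hat{x}_j$ are multiples of $x_j$.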


Next, we consider Example 1.9 again. We are given the parabolic initial boundary
value problem

$$\frac{\partial u(x,t)}{\partial t} = \frac{\partial^2 u(x,t)}{\partial x^2}, \quad 0 < x < \pi,\ t > 0,$$

$$u(0,t) = u(\pi,t) = 0, \quad t > 0, \qquad u(x,0) = u_0(x), \quad 0 < x < \pi.$$

In the inverse problem, we know the final temperature distribution $u(x,T)$,
$0 \le x \le \pi$, and we want to determine the temperature $u(x,\tau)$ at time $\tau \in
(0,T)$. As additional information, we also assume the knowledge of $E > 0$ with
$\|u(\cdot,0)\|_{L^2} \le E$.
The solution of the initial boundary value problem is given by the series


$$u(x,t) = \frac{2}{\pi} \sum_{n=1}^{\infty} e^{-n^2 t} \sin(nx) \int_0^{\pi} u_0(y) \sin(ny)\,dy, \quad 0 \le x \le \pi,\ t > 0.$$

We denote the unknown function by $v := u(\cdot,\tau)$, set $X = Y = L^2(0,\pi)$, and


$$\hat{X} := \left\{ v \in L^2(0,\pi) : v = \sum_{n=1}^{\infty} a_n e^{-n^2 \tau} \sin(n\,\cdot) \text{ with } \sum_{n=1}^{\infty} a_n^2 < \infty \right\}$$


and $\|v\|_{\hat{X}} := \sqrt{\frac{\pi}{2} \sum_{n=1}^{\infty} a_n^2}$ for $v \in \hat{X}$. In this case, the operator $K : X \to Y$ is
an integral operator with kernel


$$k(x,y) = \frac{2}{\pi} \sum_{n=1}^{\infty} e^{-n^2 (T-\tau)} \sin(nx) \sin(ny), \quad x, y \in [0,\pi],$$


(see Example 1.9). Then we have for any $\tau \in (0,T)$:

$$F\bigl(\delta, E, \|\cdot\|_{\hat{X}}\bigr) \le E^{1-\tau/T}\,\delta^{\tau/T}. \tag{1.28}$$


This means that under the information $\|u(\cdot,0)\|_{L^2} \le E$, the solution $u(\cdot,\tau)$ can
be determined from the final temperature distribution $u(\cdot,T)$, the determination
being better the closer $\tau$ is to $T$.

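Proof of (1.28): Let $v \in \hat{X}$. From the definition of $\hat{X}$ and


$$(Kv)(x) = \sum_{n=1}^{\infty} e^{-n^2 T} a_n \sin(nx),$$


we conclude that the Fourier coefficients of $v$ are given by $\exp(-n^2 \tau)\,a_n$ and
those of $Kv$ by $\exp(-n^2 T)\,a_n$. Therefore, we have to maximize


$$\|v\|_{L^2(0,\pi)}^2 = \frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^2 e^{-2n^2 \tau}$$

subject to the constraints

$$\|v\|_{\hat{X}}^2 = \frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^2 \le E^2 \quad\text{and}\quad \|Kv\|_{L^2(0,\pi)}^2 = \frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^2 e^{-2n^2 T} \le \delta^2.$$


From the Hölder inequality, we have (for $p, q > 1$ with $1/p + 1/q = 1$ to be
specified in a moment):


$$\begin{aligned}
\frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^2 e^{-2n^2 \tau} &= \frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^{2/q} \bigl( |a_n|^{2/p} e^{-2n^2 \tau} \bigr) \\
&\le \left( \frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^2 \right)^{1/q} \left( \frac{\pi}{2} \sum_{n=1}^{\infty} |a_n|^2 e^{-2p n^2 \tau} \right)^{1/p}.
\end{aligned}$$


We now choose $p = T/\tau$. Then $1/p = \tau/T$ and $1/q = 1 - \tau/T$. This yields the
assertion.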
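
The bound (1.28) can be compared with what a single Fourier mode achieves under both constraints. The sketch below rests on assumptions made only for illustration: the values of $T$, $E$, $\delta$, the truncation `n_max`, and the function names are hypothetical, and only single-mode candidates are searched, so the computed value is a lower estimate of the worst-case error.

```python
import numpy as np

def holder_bound(delta, E, tau, T):
    """Right-hand side of (1.28)."""
    return E ** (1 - tau / T) * delta ** (tau / T)

def best_single_mode(delta, E, tau, T, n_max=20):
    """Largest ||v||_{L^2} attainable with one Fourier mode under both constraints."""
    n = np.arange(1, n_max + 1)
    # Scaled amplitude sqrt(pi/2)|a_n| must satisfy amplitude <= E
    # and amplitude * exp(-n^2 T) <= delta.
    amplitude = np.minimum(E, delta * np.exp(n**2 * T))
    return np.max(amplitude * np.exp(-n**2 * tau))

T, E, delta = 1.0, 1.0, 1e-6
for tau in (0.1, 0.5, 0.9):
    print(f"tau = {tau}: single mode {best_single_mode(delta, E, tau, T):.3e}, "
          f"bound {holder_bound(delta, E, tau, T):.3e}")
```

For $\delta < E$ the bound decreases as $\tau$ approaches $T$, in line with the remark that the determination improves the closer $\tau$ is to $T$.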

The next chapter is devoted to the construction of regularization schemes
that are asymptotically optimal in the sense that, under the information $x \in \hat{X}$,
$\|x\|_{\hat{X}} \le E$, and $\|\tilde{y} - y\|_Y \le \delta$, an approximation $\tilde{x}$ and a constant $c > 0$ are
constructed such that $\|\tilde{x} - x\|_X \le c\,F(\delta, E, \|\cdot\|_{\hat{X}})$.

As the first tutorial example, we consider the problem of numerical differ- 
entiation; see Examples 1.15 and 1.20. 


Example 1.22 
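Let again, as in Examples 1.15 and 1.20, $(Kx)(t) = \int_0^t x(s)\,ds$, $t \in (0,1)$. Solving
$Kx = y$ is equivalent to differentiating $y$. We fix $h \in (0, 1/2)$ and define the
one-sided difference quotient by

$$v(t) = \begin{cases} \dfrac{1}{h} \bigl[ y(t+h) - y(t) \bigr], & 0 < t < 1/2, \\[1ex] \dfrac{1}{h} \bigl[ y(t) - y(t-h) \bigr], & 1/2 < t < 1, \end{cases}$$

for any $y \in L^2(0,1)$. First, we estimate $\|v - y'\|_{L^2}$ for smooth functions $y$; that
is, $y \in H^2(0,1)$. From Taylor's formula (see Problem 1.4), we have

$$y(t \pm h) = y(t) \pm y'(t)\,h + \int_t^{t \pm h} (t \pm h - s)\,y''(s)\,ds;$$


that is,


$$v(t) - y'(t) = \frac{1}{h} \int_t^{t+h} (t + h - s)\,y''(s)\,ds = \frac{1}{h} \int_0^h \tau\,y''(t + h - \tau)\,d\tau$$

for $t \in (0, 1/2)$ and analogously for $t \in (1/2, 1)$. Hence, we estimate


$$\begin{aligned}
h^2 \int_0^{1/2} |v(t) - y'(t)|^2\,dt &= \int_0^h \!\! \int_0^h \tau s \left[ \int_0^{1/2} y''(t+h-\tau)\,y''(t+h-s)\,dt \right] d\tau\,ds \\
&\le \int_0^h \!\! \int_0^h \tau s \sqrt{\int_0^{1/2} |y''(t+h-\tau)|^2\,dt}\,\sqrt{\int_0^{1/2} |y''(t+h-s)|^2\,dt}\;d\tau\,ds \\
&\le \|y''\|_{L^2}^2 \left[ \int_0^h \tau\,d\tau \right]^2 = \frac{1}{4} h^4 \|y''\|_{L^2}^2,
\end{aligned}$$

and analogously for $h^2 \int_{1/2}^1 |v(t) - y'(t)|^2\,dt$. Summing these estimates yields


$$\|v - y'\|_{L^2} \le \frac{1}{\sqrt{2}}\,E h,$$


where $E$ is some bound on $\|y''\|_{L^2}$.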

Now we treat the situation with errors. Instead of $y(t)$ and $y(t \pm h)$, we
measure $\tilde{y}(t)$ and $\tilde{y}(t \pm h)$, respectively. We assume that $\|\tilde{y} - y\|_{L^2} \le \delta$. Instead
of $v(t)$, we compute $\tilde{v}(t) = \pm[\tilde{y}(t \pm h) - \tilde{y}(t)]/h$ for $t \in (0, 1/2)$ or $t \in (1/2, 1)$,
respectively. Because


$$|\tilde{v}(t) - v(t)| \le \frac{|\tilde{y}(t \pm h) - y(t \pm h)|}{h} + \frac{|\tilde{y}(t) - y(t)|}{h},$$

we conclude that $\|\tilde{v} - v\|_{L^2} \le 2\sqrt{2}\,\delta/h$. Therefore, the total error due to the error
on the right-hand side and the discretization error is

$$\|\tilde{v} - y'\|_{L^2} \le \|\tilde{v} - v\|_{L^2} + \|v - y'\|_{L^2} \le 2\sqrt{2}\,\frac{\delta}{h} + \frac{1}{\sqrt{2}}\,E h. \tag{1.29}$$

By this estimate, it is desirable to choose the discretization parameter $h$ as
the minimum of the right-hand side of (1.29). Its minimum is obtained at
$h = 2\sqrt{\delta/E}$. This results in the optimal error $\|\tilde{v} - y'\|_{L^2} \le 2\sqrt{2}\,\sqrt{E\delta}$.
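
The minimizer can be checked by elementary calculus. The following worked computation takes the two terms on the right-hand side of (1.29) to be $2\sqrt{2}\,\delta/h$ and $Eh/\sqrt{2}$; the symbols $a$, $b$, and $\phi$ are introduced only for this sketch.

```latex
\phi(h) = \frac{a}{h} + b\,h, \qquad a = 2\sqrt{2}\,\delta, \quad b = \frac{E}{\sqrt{2}},
\\[1ex]
\phi'(h) = -\frac{a}{h^{2}} + b = 0
\;\Longleftrightarrow\;
h = \sqrt{a/b} = \sqrt{4\delta/E} = 2\sqrt{\delta/E},
\\[1ex]
\phi\bigl(\sqrt{a/b}\bigr) = 2\sqrt{ab} = 2\sqrt{2\,\delta E} = 2\sqrt{2}\,\sqrt{E\delta}.
```

Since $\phi''(h) = 2a/h^3 > 0$ for $h > 0$, this critical point is indeed the minimum.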

Summarizing, we note that the discretization parameter $h$ should be of order
$\sqrt{\delta/E}$ if the derivative of a function is computed by the one-sided difference
quotient. With this choice, the method is asymptotically optimal under the
information $\|x'\|_{L^2} \le E$.

The two-sided difference quotient is optimal under the a priori information
$\|x''\|_{L^2} \le E$ and results in an algorithm of order $\delta^{2/3}$ (see Example 2.4 in the
following chapter).

We have carried out the preceding analysis with respect to the $L^2$-norm
rather than the maximum norm, mainly because we present the general theory
in Hilbert spaces. For this example, however, estimates with respect to $\|\cdot\|_\infty$ are
simpler to derive (see the estimates preceding Definition 1.18 of the worst-case
error).

The result of this example is of practical importance: For many algorithms
using numerical derivatives (for example, quasi-Newton methods in optimization),
it is recommended that you choose the discretization parameter $\varepsilon$ to be
the square root of the floating-point precision of the computer because a
one-sided difference quotient is used.
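
The trade-off between data error and discretization error can be observed in a small experiment. The sketch below is an illustration under stated assumptions: the test function $y(t) = \sin t$ (so that $E = 1$ bounds $\|y''\|_{L^2}$), the noise level, the oscillatory perturbation, and the grid-based approximation of the $L^2$-norm are all hypothetical choices.

```python
import numpy as np

delta = 1e-6          # hypothetical noise level, |y_noisy - y| <= delta pointwise
E = 1.0               # valid bound on ||y''||_{L^2(0,1)} for y(t) = sin(t)

def y_noisy(t):
    """Exact data sin(t) plus a bounded, highly oscillatory perturbation."""
    return np.sin(t) + delta * np.cos(1e7 * t)

def one_sided_quotient(y, t, h):
    """Forward quotient on (0, 1/2), backward quotient on [1/2, 1)."""
    forward = (y(t + h) - y(t)) / h
    backward = (y(t) - y(t - h)) / h
    return np.where(t < 0.5, forward, backward)

def l2_error(h, t):
    """Grid approximation of the L^2(0,1) distance to the true derivative cos(t)."""
    diff = one_sided_quotient(y_noisy, t, h) - np.cos(t)
    return np.sqrt(np.mean(diff**2))

t = np.linspace(0.0, 1.0, 2001)[1:-1]            # interior grid points
h_opt = 2.0 * np.sqrt(delta / E)                 # step size suggested by the analysis
bound = 2.0 * np.sqrt(2.0) * np.sqrt(E * delta)  # corresponding error bound
for h in (1e-5, h_opt, 1e-1):
    print(f"h = {h:.1e}: error = {l2_error(h, t):.2e} (bound at h_opt: {bound:.2e})")
```

A step size far below $\sqrt{\delta/E}$ lets the data error dominate, while a step size far above it lets the discretization error dominate.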


1.4 Problems 


1.1 Show that equations (1.1) and (1.20) have at most one solution. 
Hints: Extend p in (1.1) by zero into R, and apply the Fourier trans- 
form and the convolution theorem. For (1.20) use results of the Laplace 
transform. 


1.2 Let the Sobolev spaces $X_1$ and $X_2$ be defined by (1.25a) and (1.25b),
respectively. Define the bilinear forms by


$$(x,y)_1 := \int_0^1 x'(t)\,y'(t)\,dt \quad\text{and}\quad (x,y)_2 := \int_0^1 x''(t)\,y''(t)\,dt$$


on $X_1$ and $X_2$, respectively. Prove that $X_j$ are Hilbert spaces with respect
to the inner products $(\cdot,\cdot)_j$, $j = 1, 2$, and that $\|x\|_{L^2} \le \|x\|_j$ for all $x \in X_j$,
$j = 1, 2$.


1.3 Let $K : L^2(0,1) \to L^2(0,1)$ be defined by (1.19). Show that the ranges
$R(K^*)$ and $R(K^*K)$ coincide with the spaces $X_1$ and $X_2$ defined by
(1.25a) and (1.25b), respectively. 


1.4 Prove the following version of Taylor’s formula by induction with respect 
to n and partial integration: 


Let $y \in H^{n+1}(a,b)$ and $t, t+h \in [a,b]$. Then


$$y(t+h) = \sum_{k=0}^{n} \frac{y^{(k)}(t)}{k!}\,h^k + R_n(t;h),$$


where the error term is given by

$$R_n(t;h) = \frac{1}{n!} \int_t^{t+h} (t + h - s)^n\,y^{(n+1)}(s)\,ds.$$


1.5 Verify the assertions of Example A.59 of Appendix A.6. 



Chapter 2 


Regularization Theory for 
Equations of the First Kind 


We saw in the previous chapter that many inverse problems can be formulated 
as operator equations of the form 


$$Kx = y,$$


where $K$ is a linear compact operator between Hilbert spaces $X$ and $Y$ over the
field $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. We also saw that a successful reconstruction strategy requires
additional a priori information about the solution. 

This chapter is devoted to a systematic study of regularization strategies 
for solving Kx = y. In particular, we wish to investigate under which con- 
ditions they are asymptotically optimal, that is, of the same asymptotic order 
as the worst-case error. In Section 2.1, we introduce the general concept of 
regularization. In Sections 2.2 and 2.3, we study Tikhonov’s method and the 
Landweber iteration as two of the most important regularization strategies. In 
these three sections, the regularization parameter $\alpha = \alpha(\delta)$ is chosen a priori,
that is, before we start to compute the regularized solution. We see that the
optimal regularization parameter $\alpha$ depends on bounds of the exact solution;
they are not known in advance. Therefore, it is advantageous to study strategies
for the choice of $\alpha$ that depend on the numerical algorithm and are made
during the algorithm (a posteriori). Different a posteriori choices are studied in
Sections 2.5–2.7.

All of them are motivated by the idea that it is certainly sufficient to compute
an approximation $x^{\alpha,\delta}$ of the solution $x$ such that the norm of the defect
$Kx^{\alpha,\delta} - y^\delta$ is of the same order as the perturbation error $\delta$ of the right-hand
side. The classical strategy, due to Morozov [194], determines $\alpha$ by solving a
nonlinear scalar equation. To solve this equation, we still need a numerical
algorithm such as the “regula falsi” or the Newton method. In Sections 2.6
and 2.7, we investigate two well-known iterative algorithms for solving linear
(or nonlinear) equations: Landweber's method (see [172]), which is the steepest
descent method, and the conjugate gradient method. The choices of $\alpha$ are made
implicitly by stopping the algorithm as soon as the defect $\|Kx^m - y^\delta\|_Y$ is less
than $r\delta$. Here, $r > 1$ is a given parameter.

Landweber’s method and Morozov’s discrepancy principle are easy to investi- 
gate theoretically because they can be formulated as linear regularization meth- 
ods. The study of the conjugate gradient method is more difficult because the 
choice of $\alpha$ depends nonlinearly on the right-hand side $y$. Because the proofs in
Section 2.7 are very technical, we postpone them to an appendix (Appendix B). 


2.1 A General Regularization Theory 


For simplicity, we assume throughout this chapter that the compact operator 
K is one-to-one. This is not a serious restriction because we can always replace 
the domain X by the orthogonal complement of the kernel of K. We make the 
assumption that there exists a solution $x^* \in X$ of the unperturbed equation
$Kx^* = y^*$. In other words, we assume that $y^* \in R(K)$. The injectivity of $K$
implies that this solution is unique.
In practice, the right-hand side $y^* \in Y$ is never known exactly but only up
to an error of, say, $\delta > 0$. Therefore, we assume that we know $\delta > 0$ and $y^\delta \in Y$
with

$$\|y^\delta - y^*\|_Y \le \delta. \tag{2.1}$$


It is our aim to “solve” the perturbed equation

$$Kx^\delta = y^\delta. \tag{2.2}$$


In general, (2.2) is not solvable because we cannot assume that the measured
data $y^\delta$ are in the range $R(K)$ of $K$. Therefore, the best we can hope is to
determine an approximation $x^\delta \in X$ to the exact solution $x^*$ that is “not much
worse” than the worst-case error $F(\delta, E, \|\cdot\|_{\hat{X}})$ of Definition 1.18.

An additional requirement is that the approximate solution $x^\delta$ should depend
continuously on the data $y^\delta$. In other words, it is our aim to construct a suitable
bounded approximation $R : Y \to X$ of the (unbounded) inverse operator $K^{-1} :
R(K) \to X$.


Definition 2.1 A regularization strategy is a family of linear and bounded
operators

$$R_\alpha : Y \to X, \quad \alpha > 0,$$


such that

$$\lim_{\alpha \to 0} R_\alpha K x = x \quad\text{for all } x \in X;$$


that is, the operators $R_\alpha K$ converge pointwise to the identity.

From this definition and the compactness of $K$, we conclude the following.


Theorem 2.2 Let $R_\alpha$ be a regularization strategy for a compact and injective
operator $K : X \to Y$ where $\dim X = \infty$. Then we have

(1) The operators $R_\alpha$ are not uniformly bounded; that is, there exists a
sequence $(\alpha_j)$ with $\|R_{\alpha_j}\|_{\mathcal{L}(Y,X)} \to \infty$ for $j \to \infty$.


(2) The sequence $(R_\alpha K x)$ does not converge uniformly on bounded subsets of
$X$; that is, there is no convergence $R_\alpha K$ to the identity $I$ in the operator
norm.


Proof: (1) Assume, on the contrary, that there exists $c > 0$ such that
$\|R_\alpha\|_{\mathcal{L}(Y,X)} \le c$ for all $\alpha > 0$. From $R_\alpha y \to K^{-1}y$ ($\alpha \to 0$) for all $y \in \mathcal{R}(K)$
and $\|R_\alpha y\|_X \le c\|y\|_Y$ for $\alpha > 0$, we conclude that $\|K^{-1}y\|_X \le c\|y\|_Y$ for every
$y \in \mathcal{R}(K)$; that is, $K^{-1}$ is bounded. This implies that $I = K^{-1}K : X \to X$ is
compact, a contradiction to $\dim X = \infty$.

(2) Assume that $R_\alpha K \to I$ in $\mathcal{L}(X,X)$. From the compactness of $R_\alpha K$ and
Theorem A.34, we conclude that $I$ is also compact, which again would imply
that $\dim X < \infty$.


The notion of a regularization strategy is based on unperturbed data; that
is, the regularizer $R_\alpha y^*$ converges to $x^*$ for the exact right-hand side $y^* = Kx^*$.

Now let $y^* \in \mathcal{R}(K)$ be the exact right-hand side and $y^\delta \in Y$ be the measured
data with $\|y^* - y^\delta\|_Y \le \delta$. We define

$$x^{\alpha,\delta} := R_\alpha y^\delta \tag{2.3}$$


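as an approximation of the solution $x^*$ of $Kx^* = y^*$. Then the error splits into
two parts by the following obvious application of the triangle inequality:

$$\|x^{\alpha,\delta} - x^*\|_X \le \|R_\alpha y^\delta - R_\alpha y^*\|_X + \|R_\alpha y^* - x^*\|_X \le \|R_\alpha\|_{\mathcal{L}(Y,X)}\,\|y^\delta - y^*\|_Y + \|R_\alpha K x^* - x^*\|_X$$
and thus
$$\|x^{\alpha,\delta} - x^*\|_X \le \delta\,\|R_\alpha\|_{\mathcal{L}(Y,X)} + \|R_\alpha K x^* - x^*\|_X. \tag{2.4a}$$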


Analogously, for the defect in the equation we have

$$\|Kx^{\alpha,\delta} - y^*\|_Y \le \delta\,\|KR_\alpha\|_{\mathcal{L}(Y)} + \|KR_\alpha y^* - y^*\|_Y. \tag{2.4b}$$


These are our fundamental estimates, which we use often in the following. 

We observe that the error between the exact and computed solutions consists
of two parts: The first term on the right-hand side of (2.4a) describes
the error in the data multiplied by the “condition number” $\|R_\alpha\|_{\mathcal{L}(Y,X)}$ of the
regularized problem. By Theorem 2.2, this term tends to infinity as $\alpha$ tends
to zero. The second term denotes the approximation error $\|(R_\alpha - K^{-1})y^*\|_X$
at the exact right-hand side $y^* = Kx^*$. By the definition of a regularization
strategy, this term tends to zero with $\alpha$. The following figure illustrates the
situation (Figure 2.1).
[Figure 2.1: Behavior of the total error. Axes: error versus $\alpha$; curves: $\|R_\alpha K x^* - x^*\|_X$ and $\|R_\alpha\|\,\delta$; marked parameter: $\alpha^*$.]


We need a strategy to choose $\alpha = \alpha(\delta)$ dependent on $\delta$ in order to keep the
total error as small as possible. This means that we would like to minimize

$$\delta\,\|R_\alpha\|_{\mathcal{L}(Y,X)} + \|R_\alpha K x^* - x^*\|_X.$$

The procedure is the same in every concrete situation: One has to estimate the
quantities $\|R_\alpha\|_{\mathcal{L}(Y,X)}$ and $\|R_\alpha K x^* - x^*\|_X$ in terms of $\alpha$ and then minimize
this upper bound with respect to $\alpha$. Before we carry out these steps for two
model examples, we introduce the following notation.


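Definition 2.3 A parameter choice $\alpha = \alpha(\delta)$ for the regularization strategy $R_\alpha$
is called admissible if $\lim_{\delta \to 0} \alpha(\delta) = 0$ and

$$\sup\bigl\{\|R_{\alpha(\delta)} y^\delta - x\|_X : y^\delta \in Y,\ \|Kx - y^\delta\|_Y \le \delta\bigr\} \to 0, \quad \delta \to 0,$$
for every $x \in X$.

From the fundamental estimate (2.4a), we note that a parameter choice
$\alpha = \alpha(\delta)$ is admissible if $\alpha(\delta) \to 0$ and $\delta\,\|R_{\alpha(\delta)}\|_{\mathcal{L}(Y,X)} \to 0$ as $\delta \to 0$ and
$\|R_\alpha K x - x\|_X \to 0$ as $\alpha \to 0$.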

Example 2.4 (Numerical differentiation by two-sided difference quotient)

Let again $(Kx)(t) = \int_0^t x(s)\,ds$, $t \in (0,1)$; that is, solving $Kx = y$ is equivalent to
differentiating $y$. It is our aim to compute the derivative of $y$ by the two-sided
difference quotient (see Example 1.22 for the one-sided difference quotient).
Here $\alpha = h$ is the stepsize, and we define

$$(R_h y)(t) = \begin{cases}
\frac{1}{h}\bigl[4y(t + \tfrac{h}{2}) - y(t + h) - 3y(t)\bigr], & 0 < t < \tfrac{h}{2}, \\
\frac{1}{h}\bigl[y(t + \tfrac{h}{2}) - y(t - \tfrac{h}{2})\bigr], & \tfrac{h}{2} < t < 1 - \tfrac{h}{2}, \\
\frac{1}{h}\bigl[3y(t) + y(t - h) - 4y(t - \tfrac{h}{2})\bigr], & 1 - \tfrac{h}{2} < t < 1,
\end{cases}$$

for $y \in L^2(0,1)$. In order to prove that $R_h$ defines a regularization strategy,
it suffices to show that $R_h K$ are uniformly bounded with respect to $h$ in the
operator norm of $L^2(0,1)$ and that $\|R_h K x - x\|_{L^2}$ tends to zero for smooth $x$ (see
Theorem A.29 of Appendix A.3). Later, we show convergence for $x \in H^2(0,1)$.

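The fundamental theorem of calculus (or Taylor’s formula from Problem 1.4
for $n = 0$) yields

$$(R_h y)(t) = \frac{1}{h} \int_{t-h/2}^{t+h/2} y'(s)\,ds = \frac{1}{h} \int_{-h/2}^{h/2} y'(s+t)\,ds, \quad \frac{h}{2} < t < 1 - \frac{h}{2},$$

and thus

$$\int_{h/2}^{1-h/2} |(R_h y)(t)|^2\,dt = \frac{1}{h^2} \int_{-h/2}^{h/2} \int_{-h/2}^{h/2} \int_{h/2}^{1-h/2} y'(s+t)\,y'(\tau+t)\,dt\,d\tau\,ds.$$

The Cauchy-Schwarz inequality yields

$$\int_{h/2}^{1-h/2} |(R_h y)(t)|^2\,dt \le \|y'\|_{L^2}^2.$$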


From $(R_h y)(t) = 4[y(t + h/2) - y(t)]/h - [y(t + h) - y(t)]/h$ for $0 < t < h/2$
and an analogous representation for $1 - h/2 < t < 1$, similar estimates yield the
existence of $c > 0$ with

$$\|R_h y\|_{L^2} \le c\,\|y'\|_{L^2}$$

for all $y \in H^1(0,1)$. For $y = Kx$, $x \in L^2(0,1)$, the uniform boundedness of
$(R_h K)$ follows.

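Now let $x \in H^2(0,1)$ and thus $y = Kx \in H^3(0,1)$. We apply Taylor’s
formula (see Problem 1.4) in the form (first again for $h/2 < t < 1 - h/2$)

$$y(t \pm h/2) - y(t) \mp \frac{h}{2}\,y'(t) - \frac{h^2}{8}\,y''(t) = \pm\frac{1}{2} \int_0^{h/2} s^2\,y'''(t \pm h/2 \mp s)\,ds.$$

Subtracting the formulas for $+$ and $-$ yields

$$(R_h y)(t) - y'(t) = \frac{1}{2h} \int_0^{h/2} s^2 \bigl[y'''(t + h/2 - s) + y'''(t - h/2 + s)\bigr]\,ds,$$

and thus by changing the orders of integration and using the Cauchy-Schwarz
inequality

$$\int_{h/2}^{1-h/2} |(R_h y)(t) - y'(t)|^2\,dt \le \frac{1}{h^2}\,\|y'''\|_{L^2}^2 \left(\int_0^{h/2} s^2\,ds\right)^2 = \frac{1}{24^2}\,\|y'''\|_{L^2}^2\,h^4.$$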
Similar applications of Taylor’s formula in the intervals $(0, h/2)$ and
$(1 - h/2, 1)$ yield an estimate of the form

$$\|R_h K x - x\|_{L^2} = \|R_h y - y'\|_{L^2} \le c_1 E\,h^2$$

for all $x \in H^2(0,1)$ with $\|x''\|_{L^2} \le E$. Together with the uniform boundedness
of $R_h K$, this implies that $R_h K x \to x$ for all $x \in L^2(0,1)$.

In order to apply the fundamental estimate (2.4a), we must estimate the
first term, that is, the $L^2$-norm of $R_h$. It is easily checked that there exists
$c_2 > 0$ with $\|R_h y\|_{L^2} \le c_2 \|y\|_{L^2}/h$ for all $y \in L^2(0,1)$. Estimate (2.4a) yields

$$\|R_h y^\delta - x^*\|_{L^2} \le c_2\,\frac{\delta}{h} + c_1 E\,h^2,$$

where $E$ is a bound on $\|(x^*)''\|_{L^2} = \|(y^*)'''\|_{L^2}$. Minimization with respect to
$h$ of the expression on the right-hand side leads to

$$h(\delta) = c\,\sqrt[3]{\delta/E} \quad \text{and} \quad \|R_{h(\delta)} y^\delta - x^*\|_{L^2} \le \tilde{c}\,E^{1/3}\,\delta^{2/3}$$

for some $c > 0$ and $\tilde{c} = c_2/c + c_1 c^2$.
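The minimization can be carried out explicitly. The following short computation is a sketch that uses only the constants $c_1$, $c_2$ from the two estimates above; the name $\varphi$ for the upper bound is introduced here for illustration only.

```latex
\begin{align*}
\varphi(h) &:= c_2\,\frac{\delta}{h} + c_1 E\,h^2, \qquad
\varphi'(h) = -c_2\,\frac{\delta}{h^2} + 2 c_1 E\,h = 0
\iff h^3 = \frac{c_2}{2 c_1}\,\frac{\delta}{E}, \\
\varphi\bigl(c\,(\delta/E)^{1/3}\bigr)
  &= \frac{c_2}{c}\,\delta^{2/3} E^{1/3} + c_1 c^2\,E^{1/3}\delta^{2/3}
   = \Bigl(\frac{c_2}{c} + c_1 c^2\Bigr) E^{1/3}\,\delta^{2/3}.
\end{align*}
```

The exact minimizer corresponds to $c = (c_2/(2c_1))^{1/3}$, but every fixed $c > 0$ gives the same order $\delta^{2/3}$.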

We observe that this strategy is asymptotically optimal for the information
$\|x''\|_{L^2} \le E$ because it provides an approximation $x^\delta$ that is asymptotically not
worse than the worst-case error (see Example 1.20).
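The trade-off in the choice of $h$ can be observed numerically. The following is a minimal sketch, not part of the original example: it uses the test function $y(t) = \sin t$, a deterministic high-frequency perturbation of size $\delta$ (the frequency `freq` is an arbitrary assumption), only the interior formula of $R_h$, and the maximum error on a grid instead of the $L^2$-norm.

```python
import math

def central_difference(y, t, h):
    """Two-sided difference quotient [y(t + h/2) - y(t - h/2)] / h."""
    return (y(t + h / 2) - y(t - h / 2)) / h

def max_error(delta, h, n=200):
    """Largest error of the difference quotient on a grid in [0.25, 0.75]
    for y(t) = sin(t), perturbed by a deterministic error of size delta."""
    freq = 1.0e4  # assumed frequency of the perturbation, |perturbation| <= delta
    y_noisy = lambda t: math.sin(t) + delta * math.sin(freq * t)
    points = [0.25 + 0.5 * i / n for i in range(n + 1)]  # stay away from the boundary
    return max(abs(central_difference(y_noisy, t, h) - math.cos(t)) for t in points)

delta = 1.0e-6
h_opt = delta ** (1.0 / 3.0)             # h ~ (delta/E)^(1/3) with E of order one

err_small_h = max_error(delta, 1.0e-5)   # data error delta/h dominates
err_opt = max_error(delta, h_opt)        # both error parts are balanced
err_large_h = max_error(delta, 0.3)      # approximation error h^2 dominates

print(err_small_h, err_opt, err_large_h)
```

In this sketch, the stepsize of order $\delta^{1/3}$ gives a smaller error than both the much smaller and the much larger stepsize.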


The (one- or two-sided) difference quotient uses only local portions of the 
function y. An alternative approach is to first smooth the function y by molli- 
fication and then to differentiate the mollified function. 


Example 2.5 (Numerical differentiation by mollification)
Again, we define the operator $(Kx)(t) = \int_0^t x(s)\,ds$, $t \in [0,1]$, but now as an
operator from the (closed) subspace

$$L_0^2(0,1) := \Bigl\{z \in L^2(0,1) : \int_0^1 z(s)\,ds = 0\Bigr\}$$

of $L^2(0,1)$ into $L^2(0,1)$.
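We define the Gaussian kernel $\psi_\alpha$ by

$$\psi_\alpha(t) = \frac{1}{\alpha\sqrt{\pi}} \exp(-t^2/\alpha^2), \quad t \in \mathbb{R},$$

where $\alpha > 0$ denotes a parameter. Then $\int_{-\infty}^{\infty} \psi_\alpha(t)\,dt = 1$, and the convolution

$$(\psi_\alpha * y)(t) := \int_{-\infty}^{\infty} \psi_\alpha(t - s)\,y(s)\,ds = \int_{-\infty}^{\infty} \psi_\alpha(s)\,y(t - s)\,ds, \quad t \in \mathbb{R},$$

exists and is an $L^2$-function for every $y \in L^2(\mathbb{R})$. Furthermore, by Young’s
inequality (see [44], p. 102), we have

$$\|\psi_\alpha * y\|_{L^2(\mathbb{R})} \le \|\psi_\alpha\|_{L^1(\mathbb{R})}\,\|y\|_{L^2(\mathbb{R})} = \|y\|_{L^2(\mathbb{R})} \quad \text{for all } y \in L^2(\mathbb{R}).$$

Therefore, the operators $y \mapsto \psi_\alpha * y$ are uniformly bounded in $L^2(\mathbb{R})$ with
respect to $\alpha$. We note that $\psi_\alpha * y$ is infinitely often differentiable on $\mathbb{R}$ for every
$y \in L^2(\mathbb{R})$.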

We need the two convergence properties

$$\|\psi_\alpha * z - z\|_{L^2(\mathbb{R})} \to 0 \text{ as } \alpha \to 0 \quad \text{for every } z \in L^2(0,1) \tag{2.5a}$$

and
$$\|\psi_\alpha * z - z\|_{L^2(\mathbb{R})} \le \sqrt{2}\,\alpha\,\|z'\|_{L^2(0,1)} \tag{2.5b}$$

for every $z \in H^1(0,1)$ with $z(0) = z(1) = 0$. Here and in the following, we
identify functions $z \in L^2(0,1)$ with functions $z \in L^2(\mathbb{R})$ where we think of
them being extended by zero outside of $[0,1]$.
Proof of (2.5a), (2.5b): It is sufficient to prove (2.5b) because the space
$\{z \in H^1(0,1) : z(0) = z(1) = 0\}$ is dense in $L^2(0,1)$, and the operators
$z \mapsto \psi_\alpha * z$ are uniformly bounded from $L^2(0,1)$ into $L^2(\mathbb{R})$.

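Let the Fourier transform be defined by

$$(\mathcal{F}z)(t) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z(s)\,e^{ist}\,ds, \quad t \in \mathbb{R},$$

for $z \in \mathcal{S}$, where the Schwartz space $\mathcal{S}$ is defined by
$$\mathcal{S} := \Bigl\{z \in C^\infty(\mathbb{R}) : \sup_{t \in \mathbb{R}} |t^p z^{(q)}(t)| < \infty \text{ for all } p, q \in \mathbb{N}_0\Bigr\}.$$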


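With this normalization, Plancherel’s theorem and the convolution theorem
take the form (see [44])

$$\|\mathcal{F}z\|_{L^2(\mathbb{R})} = \|z\|_{L^2(\mathbb{R})}, \qquad \mathcal{F}(u * z)(t) = \sqrt{2\pi}\,(\mathcal{F}u)(t)\,(\mathcal{F}z)(t), \quad t \in \mathbb{R},$$

for all $z, u \in \mathcal{S}$. Because $\mathcal{S}$ is dense in $L^2(\mathbb{R})$ the first formula allows us to
extend $\mathcal{F}$ to a bounded operator from $L^2(\mathbb{R})$ onto itself (see Theorem A.30),
and both formulas hold also for $z \in L^2(\mathbb{R})$. Now we combine these properties
and conclude that

$$\|\psi_\alpha * z - z\|_{L^2(\mathbb{R})} = \|\mathcal{F}(\psi_\alpha * z) - \mathcal{F}z\|_{L^2(\mathbb{R})} = \bigl\|\bigl[\sqrt{2\pi}\,\mathcal{F}(\psi_\alpha) - 1\bigr]\,\mathcal{F}z\bigr\|_{L^2(\mathbb{R})}$$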


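for every $z \in L^2(0,1)$. Partial integration yields that

$$\mathcal{F}(z')(t) = \frac{1}{\sqrt{2\pi}} \int_0^1 z'(s)\,e^{ist}\,ds = -\frac{it}{\sqrt{2\pi}} \int_0^1 z(s)\,e^{ist}\,ds = (-it)\,(\mathcal{F}z)(t)$$

for all $z \in H^1(0,1)$ with $z(0) = z(1) = 0$. We define the function $\varphi_\alpha$ by

$$\varphi_\alpha(t) := \frac{1}{it}\bigl[1 - \sqrt{2\pi}\,\mathcal{F}(\psi_\alpha)(t)\bigr] = \frac{1}{it}\bigl[1 - e^{-\alpha^2 t^2/4}\bigr], \quad t \in \mathbb{R}.$$

Then we conclude that

$$\|\psi_\alpha * z - z\|_{L^2(\mathbb{R})} = \|\varphi_\alpha\,\mathcal{F}(z')\|_{L^2(\mathbb{R})} \le \|\varphi_\alpha\|_\infty\,\|\mathcal{F}(z')\|_{L^2(\mathbb{R})} = \|\varphi_\alpha\|_\infty\,\|z'\|_{L^2(0,1)}.$$

From

$$|\varphi_\alpha(t)| = \frac{\alpha}{2}\,\frac{1 - e^{-(\alpha t/2)^2}}{\alpha |t|/2}$$

and the elementary estimate $[1 - \exp(-\tau^2)]/\tau \le 2\sqrt{2}$ for all $\tau > 0$, the desired
estimate (2.5b) follows.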
After these preparations, we define the regularization operators
$R_\alpha : L^2(0,1) \to L_0^2(0,1)$ by

$$(R_\alpha y)(t) = \frac{d}{dt}(\psi_\alpha * y)(t) - \int_0^1 \frac{d}{ds}(\psi_\alpha * y)(s)\,ds = (\psi_\alpha' * y)(t) - \int_0^1 (\psi_\alpha' * y)(s)\,ds$$

for $t \in (0,1)$ and $y \in L^2(0,1)$. First, we note that $R_\alpha$ is well-defined, that is,
maps $L^2(0,1)$ into $L_0^2(0,1)$ and is bounded. To prove that $R_\alpha$ is a regularization
strategy, we proceed as in the previous example and show that


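(i) $\|R_\alpha y\|_{L^2} \le \frac{4}{\alpha\sqrt{\pi}}\,\|y\|_{L^2}$ for all $\alpha > 0$ and $y \in L^2(0,1)$,

(ii) $\|R_\alpha K x\|_{L^2} \le 2\,\|x\|_{L^2}$ for all $\alpha > 0$ and $x \in L_0^2(0,1)$, that is, the operators
$R_\alpha K$ are uniformly bounded in $L_0^2(0,1)$, and

(iii) $\|R_\alpha K x - x\|_{L^2} \le 2\sqrt{2}\,\alpha\,\|x'\|_{L^2}$ for all $\alpha > 0$ and $x \in H_{00}^1(0,1)$, where we
have set

$$H_{00}^1(0,1) := \Bigl\{x \in H^1(0,1) : x(0) = x(1) = 0,\ \int_0^1 x(s)\,ds = 0\Bigr\}.$$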
To prove part (i), we estimate with Young’s inequality

$$\|R_\alpha y\|_{L^2(0,1)} \le 2\,\|\psi_\alpha' * y\|_{L^2(0,1)} \le 2\,\|\psi_\alpha' * y\|_{L^2(\mathbb{R})} \le 2\,\|\psi_\alpha'\|_{L^1(\mathbb{R})}\,\|y\|_{L^2(0,1)} \le \frac{4}{\alpha\sqrt{\pi}}\,\|y\|_{L^2(0,1)}$$

for all $y \in L^2(0,1)$ because

$$\|\psi_\alpha'\|_{L^1(\mathbb{R})} = -2 \int_0^\infty \psi_\alpha'(s)\,ds = 2\,\psi_\alpha(0) = \frac{2}{\alpha\sqrt{\pi}}.$$

This proves part (i).
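Now let $y \in H^1(0,1)$ with $y(0) = y(1) = 0$. Then, by partial integration,

$$(\psi_\alpha' * y)(t) = \int_0^1 \psi_\alpha'(t - s)\,y(s)\,ds = \int_0^1 \psi_\alpha(t - s)\,y'(s)\,ds = (\psi_\alpha * y')(t).$$

Taking $y = Kx$, $x \in L_0^2(0,1)$ yields

$$(R_\alpha K x)(t) = (\psi_\alpha * x)(t) - \int_0^1 (\psi_\alpha * x)(s)\,ds.$$

Part (ii) now follows from Young’s inequality.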
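Finally, we write

$$(R_\alpha K x)(t) - x(t) = (\psi_\alpha * x)(t) - x(t) - \int_0^1 \bigl[(\psi_\alpha * x)(s) - x(s)\bigr]\,ds$$

because $\int_0^1 x(s)\,ds = 0$. Therefore, by (2.5b),

$$\|R_\alpha K x - x\|_{L^2(0,1)} \le 2\,\|\psi_\alpha * x - x\|_{L^2(0,1)} \le 2\sqrt{2}\,\alpha\,\|x'\|_{L^2}$$

for all $x \in H_{00}^1(0,1)$. This proves part (iii).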

Now we conclude that $R_\alpha K x$ converges to $x$ for any $x \in L_0^2(0,1)$ by (ii), (iii),
and the denseness of $H_{00}^1(0,1)$ in $L_0^2(0,1)$. Therefore, $R_\alpha$ defines a regularization
strategy. From (i) and (iii), we rewrite the fundamental estimate (2.4a) as

$$\|R_\alpha y^\delta - x^*\|_{L^2} \le \frac{4\delta}{\alpha\sqrt{\pi}} + 2\sqrt{2}\,\alpha E$$

if $x^* \in H_{00}^1(0,1)$ with $\|(x^*)'\|_{L^2} \le E$, $y^* = Kx^*$, and $y^\delta \in L^2(0,1)$ such that
$\|y^\delta - y^*\|_{L^2} \le \delta$. The choice $\alpha = c\sqrt{\delta/E}$ again leads to the optimal order
$\mathcal{O}(\sqrt{\delta E})$.

For further applications of the mollification method, we refer to the mono- 
graph by Murio [198]. There exists an enormous number of publications on 
numerical differentiation. We mention only the papers [3, 69, 74, 165] and, for 
more general Volterra equations of the first kind, [25, 26, 76, 77, 178]. 
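The mollification operator $R_\alpha$ of Example 2.5 can be approximated with a simple quadrature rule. The following is a minimal sketch under stated assumptions: the integrals are replaced by the midpoint rule on a uniform grid, the test solution $x^*(t) = \sin(2\pi t)$ (which lies in $H_{00}^1(0,1)$) and all function names are chosen here for illustration, and exact data $y^* = Kx^*$ are used.

```python
import math

def psi_prime(t, alpha):
    """Derivative of the Gaussian kernel psi_alpha(t) = exp(-t^2/alpha^2) / (alpha sqrt(pi))."""
    return -2.0 * t / (alpha ** 3 * math.sqrt(math.pi)) * math.exp(-(t / alpha) ** 2)

def mollified_derivative(y_vals, alpha):
    """Midpoint-rule approximation of (R_alpha y)(t_i) on the grid t_i = (i + 1/2)/n."""
    n = len(y_vals)
    step = 1.0 / n
    grid = [(i + 0.5) * step for i in range(n)]
    # (psi_alpha' * y)(t), where y is extended by zero outside of [0, 1]
    conv = [step * sum(psi_prime(t - s, alpha) * ys for s, ys in zip(grid, y_vals))
            for t in grid]
    mean = step * sum(conv)  # integral of psi_alpha' * y over (0, 1)
    return [c - mean for c in conv]

n, alpha = 200, 0.02
step = 1.0 / n
grid = [(i + 0.5) * step for i in range(n)]

x_true = [math.sin(2 * math.pi * t) for t in grid]                    # x* in H^1_00(0,1)
y_vals = [(1 - math.cos(2 * math.pi * t)) / (2 * math.pi) for t in grid]  # y* = K x*

x_rec = mollified_derivative(y_vals, alpha)
err = math.sqrt(step * sum((a - b) ** 2 for a, b in zip(x_rec, x_true)))

E = math.sqrt(2.0) * math.pi               # L^2-norm of (x*)' = 2 pi cos(2 pi t)
bound = 2.0 * math.sqrt(2.0) * alpha * E   # right-hand side of estimate (iii)
print(err, bound)
```

The computed discrete $L^2$-error can be compared with the bound $2\sqrt{2}\,\alpha E$ from part (iii); the bound is a worst-case estimate and is typically far from sharp for a particular smooth $x^*$.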




A convenient method to construct classes of admissible regularization strategies
is given by filtering singular systems. Let $K : X \to Y$ be a linear
compact operator, and let $\{\mu_j, x_j, y_j : j \in J\}$ be a singular system for $K$
(see Appendix A.6, Definition A.56, and Theorem A.57). As readily seen,
the solution $x$ of $Kx = y$ is given by Picard’s theorem (see Theorem A.58
of Appendix A.6) as

$$x = \sum_{j=1}^{\infty} \frac{1}{\mu_j}\,(y, y_j)_Y\,x_j \tag{2.6}$$

provided the series converges, that is, $y \in \mathcal{R}(K)$. This result illustrates again
the influence of errors in $y$ because the large factors $1/\mu_j$ (note that $\mu_j \to 0$ as
$j \to \infty$) amplify the errors in the expansion coefficients $(y, y_j)_Y$. We construct
regularization strategies by damping the factors $1/\mu_j$.


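Theorem 2.6 Let $K : X \to Y$ be compact and one-to-one with singular system
$\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$ and
$$q : (0,\infty) \times \bigl(0, \|K\|_{\mathcal{L}(X,Y)}\bigr] \to \mathbb{R}$$
be a function with the following properties:
(1) $|q(\alpha,\mu)| \le 1$ for all $\alpha > 0$ and $0 < \mu \le \|K\|_{\mathcal{L}(X,Y)}$.
(2) For every $\alpha > 0$, there exists $c(\alpha)$ such that
$$|q(\alpha,\mu)| \le c(\alpha)\,\mu \quad \text{for all } 0 < \mu \le \|K\|_{\mathcal{L}(X,Y)}.$$
(3a) $\lim_{\alpha \to 0} q(\alpha,\mu) = 1$ for every $0 < \mu \le \|K\|_{\mathcal{L}(X,Y)}$.
Then the operator $R_\alpha : Y \to X$, $\alpha > 0$, defined by

$$R_\alpha y := \sum_{j=1}^{\infty} \frac{q(\alpha,\mu_j)}{\mu_j}\,(y, y_j)_Y\,x_j, \quad y \in Y, \tag{2.7}$$

is a regularization strategy with $\|R_\alpha\|_{\mathcal{L}(Y,X)} \le c(\alpha)$ and $\|KR_\alpha\|_{\mathcal{L}(Y)} \le 1$. A
choice $\alpha = \alpha(\delta)$ is admissible if $\alpha(\delta) \to 0$ and $\delta\,c(\alpha(\delta)) \to 0$ as $\delta \to 0$. The
function $q$ is called a regularizing filter for $K$.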


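Proof: The operators $R_\alpha$ are bounded because we have by assumption (2) that

$$\|R_\alpha y\|_X^2 = \sum_{j=1}^{\infty} \bigl[q(\alpha,\mu_j)\bigr]^2\,\frac{1}{\mu_j^2}\,|(y, y_j)_Y|^2 \le c(\alpha)^2 \sum_{j=1}^{\infty} |(y, y_j)_Y|^2 \le c(\alpha)^2\,\|y\|_Y^2;$$

that is, $\|R_\alpha\|_{\mathcal{L}(Y,X)} \le c(\alpha)$. From

$$KR_\alpha y = \sum_{j=1}^{\infty} \frac{q(\alpha,\mu_j)}{\mu_j}\,(y, y_j)_Y\,Kx_j = \sum_{j=1}^{\infty} q(\alpha,\mu_j)\,(y, y_j)_Y\,y_j,$$

we conclude that $\|KR_\alpha y\|_Y^2 = \sum_{j=1}^{\infty} |q(\alpha,\mu_j)|^2\,|(y, y_j)_Y|^2 \le \|y\|_Y^2$ and thus
$\|KR_\alpha\|_{\mathcal{L}(Y)} \le 1$. Furthermore, from

$$R_\alpha K x = \sum_{j=1}^{\infty} \frac{q(\alpha,\mu_j)}{\mu_j}\,(Kx, y_j)_Y\,x_j \quad \text{and} \quad x = \sum_{j=1}^{\infty} (x, x_j)_X\,x_j$$

and $(Kx, y_j)_Y = (x, K^* y_j)_X = \mu_j\,(x, x_j)_X$, we conclude that

$$\|R_\alpha K x - x\|_X^2 = \sum_{j=1}^{\infty} \bigl[q(\alpha,\mu_j) - 1\bigr]^2\,|(x, x_j)_X|^2. \tag{2.8}$$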

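Here, $K^*$ denotes the adjoint of $K$ (see Theorem A.24). This fundamental
representation will be used quite often in the following. Now let $x \in X$ be
arbitrary but fixed. For $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that

$$\sum_{j=N+1}^{\infty} |(x, x_j)_X|^2 < \frac{\varepsilon^2}{8}.$$

By (3a) there exists $\alpha_0 > 0$ such that

$$\bigl[q(\alpha,\mu_j) - 1\bigr]^2 < \frac{\varepsilon^2}{2\,\|x\|_X^2} \quad \text{for all } j = 1, \ldots, N \text{ and } 0 < \alpha \le \alpha_0.$$

With (1) we conclude that

$$\|R_\alpha K x - x\|_X^2 = \sum_{j=1}^{N} \bigl[q(\alpha,\mu_j) - 1\bigr]^2 |(x, x_j)_X|^2 + \sum_{j=N+1}^{\infty} \bigl[q(\alpha,\mu_j) - 1\bigr]^2 |(x, x_j)_X|^2 < \frac{\varepsilon^2}{2\,\|x\|_X^2}\,\|x\|_X^2 + \frac{\varepsilon^2}{2} = \varepsilon^2$$

for all $0 < \alpha \le \alpha_0$. Thus, we have shown that

$$\lim_{\alpha \to 0} R_\alpha K x = x \quad \text{for every } x \in X.$$

Using this and $\delta\,\|R_{\alpha(\delta)}\|_{\mathcal{L}(Y,X)} \le \delta\,c(\alpha(\delta)) \to 0$ in the fundamental estimate
(2.4a) ends the proof.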


In this theorem, we showed convergence of $R_\alpha y$ to the solution $x$. As Examples
2.4 and 2.5 indicate, we are particularly interested in optimal strategies,
that is, those that converge of the same order as the worst-case error. We see
in the next theorem that a proper replacement of assumption (3a) together
with the abstract smoothness assumption that $x$ belongs to the range of $K^*$ or
$(K^*K)^{\sigma/2}$, respectively, leads to such optimal strategies.
Theorem 2.7 Let the assumptions (1) and (2) of the previous theorem hold.
Let (3a) be replaced by the stronger assumption:

(3b) For every $\sigma > 0$, there exists a continuous function $\omega_\sigma : (0,\infty) \to (0,\infty)$
with $\lim_{\alpha \to 0} \omega_\sigma(\alpha) = 0$ and

$$\mu^\sigma\,|q(\alpha,\mu) - 1| \le \omega_\sigma(\alpha) \quad \text{for all } \alpha > 0 \text{ and } 0 < \mu \le \|K\|_{\mathcal{L}(X,Y)}. \tag{2.9}$$
Then

$$\|R_\alpha K x - x\|_X \le \omega_0(\alpha)\,\|x\|_X, \tag{2.10a}$$
$$\|KR_\alpha K x - Kx\|_Y \le \omega_1(\alpha)\,\|x\|_X. \tag{2.10b}$$

If, furthermore, $x \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$, then

$$\|R_\alpha K x - x\|_X \le \omega_\sigma(\alpha)\,\|z\|_X, \tag{2.11a}$$
$$\|KR_\alpha K x - Kx\|_Y \le \omega_{\sigma+1}(\alpha)\,\|z\|_X, \tag{2.11b}$$

where $x = (K^*K)^{\sigma/2} z$. We note that the powers $(K^*K)^{\sigma/2}$ are defined by the
singular system; see (A.47). In the case $\sigma = 1$ we can replace the assumption
$x = (K^*K)^{1/2} z$ by $x = K^* z$ for some $z \in Y$, and (2.11a), (2.11b) hold with
$\|z\|_Y$ replacing $\|z\|_X$.


Proof: Estimate (2.10a) follows immediately from (2.8) and the definition of
$\omega_0(\alpha)$. Furthermore, from

$$KR_\alpha K x - Kx = \sum_{j=1}^{\infty} \bigl[q(\alpha,\mu_j) - 1\bigr]\,(x, x_j)_X\,Kx_j = \sum_{j=1}^{\infty} \bigl[q(\alpha,\mu_j) - 1\bigr]\,\mu_j\,(x, x_j)_X\,y_j$$

we conclude that $\|KR_\alpha K x - Kx\|_Y^2 \le \omega_1(\alpha)^2\,\|x\|_X^2$ which proves (2.10b). Let
now $x = (K^*K)^{\sigma/2} z$ for some $z \in X$. Then $(x, x_j)_X = \mu_j^\sigma\,(z, x_j)_X$ and, again
from formula (2.8),

$$\|R_\alpha K x - x\|_X^2 = \sum_{j=1}^{\infty} \bigl[q(\alpha,\mu_j) - 1\bigr]^2 \mu_j^{2\sigma}\,|(z, x_j)_X|^2 \le \omega_\sigma(\alpha)^2\,\|z\|_X^2.$$

Estimate (2.11b) is proven analogously.


We substitute these estimates into the fundamental estimates (2.4a) and
(2.4b). Therefore, under the assumptions of the previous theorem,

$$\|x^{\alpha,\delta} - x^*\|_X \le \delta\,\|R_\alpha\|_{\mathcal{L}(Y,X)} + \omega_\sigma(\alpha)\,\|z\|_X, \tag{2.12a}$$
$$\|Kx^{\alpha,\delta} - y^*\|_Y \le \delta + \omega_{\sigma+1}(\alpha)\,\|z\|_X, \tag{2.12b}$$

for $\sigma > 0$ where $x^* = (K^*K)^{\sigma/2} z$. Note that we used $\|KR_\alpha\|_{\mathcal{L}(Y)} \le 1$ (see
Theorem 2.6).


There are many examples of functions $q : (0,\infty) \times (0, \|K\|_{\mathcal{L}(X,Y)}] \to \mathbb{R}$ that
satisfy assumptions (1), (2), and (3a-b) of the preceding theorems. We study
two of the following three filter functions in the next sections in more detail.


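Theorem 2.8 The following three functions $q$ satisfy the assumptions (1), (2),
(3a), and (3b) of Theorems 2.6 or 2.7, respectively:

(a) $q(\alpha,\mu) = \mu^2/(\alpha + \mu^2)$. This choice satisfies (2) with $c(\alpha) = 1/(2\sqrt{\alpha})$.
Assumption (3b) holds with $\omega_\sigma(\alpha) = c_\sigma\,\alpha^{\sigma/2}$ if $\sigma \le 2$ and $\omega_\sigma(\alpha) \le c_\sigma\,\alpha$ if
$\sigma > 2$. Here $c_\sigma$ is independent of $\alpha$. It is $c_1 = 1/2$ and $c_2 = 1$.

(b) $q(\alpha,\mu) = 1 - (1 - a\mu^2)^{1/\alpha}$ for some $0 < a < 1/\|K\|_{\mathcal{L}(X,Y)}^2$. In this case (2)
holds with $c(\alpha) = \sqrt{a/\alpha}$, and (3b) is satisfied with $\omega_\sigma(\alpha) = \bigl(\frac{\sigma}{2a}\bigr)^{\sigma/2} \alpha^{\sigma/2}$
for all $\sigma, \alpha > 0$.

(c) Let $q$ be defined by
$$q(\alpha,\mu) = \begin{cases} 1, & \mu^2 \ge \alpha, \\ 0, & \mu^2 < \alpha. \end{cases}$$
In this case (2) holds with $c(\alpha) = 1/\sqrt{\alpha}$, and (3b) is satisfied with $\omega_\sigma(\alpha) =
\alpha^{\sigma/2}$ for all $\sigma, \alpha > 0$.

Therefore, all of the functions $q$ defined in (a), (b), and (c) are regularizing
filters.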


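Proof: For all three cases, properties (1) and (3a) are obvious.
(a) Property (2) follows from the elementary estimate $\frac{\mu}{\alpha + \mu^2} \le \frac{1}{2\sqrt{\alpha}}$ for all
$\alpha, \mu > 0$ (which is equivalent to $(\mu - \sqrt{\alpha})^2 \ge 0$).
We observe that $1 - q(\alpha,\mu) = \alpha/(\alpha + \mu^2)$. For fixed $\alpha > 0$ and $\sigma > 0$, we define
the function

$$f(\mu) = \mu^\sigma \bigl(1 - q(\alpha,\mu)\bigr) = \frac{\alpha\,\mu^\sigma}{\alpha + \mu^2}, \quad 0 \le \mu \le \mu_0,$$

and compute its derivative as

$$f'(\mu) = \alpha\,\mu^{\sigma-1}\,\frac{(\sigma - 2)\,\mu^2 + \alpha\sigma}{(\alpha + \mu^2)^2}.$$

Then $f$ is monotonically increasing for $\sigma \ge 2$ and thus $f(\mu) \le f(\mu_0) \le \mu_0^{\sigma-2}\,\alpha$.
If $\sigma < 2$, then we compute its maximum as $\mu_{\max}^2 = \frac{\alpha\sigma}{2 - \sigma}$ with value $f(\mu_{\max}) \le
c_\sigma\,\alpha^{\sigma/2}$. We leave the details to the reader.
(b) Property (2) follows immediately from Bernoulli’s inequality:

$$1 - (1 - a\mu^2)^{1/\alpha} \le 1 - \Bigl(1 - \frac{a\mu^2}{\alpha}\Bigr) = \frac{a\mu^2}{\alpha};$$

thus $|q(\alpha,\mu)| \le \sqrt{|q(\alpha,\mu)|} \le \sqrt{a/\alpha}\,\mu$.
(3b) is shown by the same method as in (a) when we define

$$f(\mu) = \mu^\sigma \bigl(1 - q(\alpha,\mu)\bigr) = \mu^\sigma (1 - a\mu^2)^{1/\alpha}, \quad 0 \le \mu \le 1/\sqrt{a},$$

with derivative $f'(\mu) = \mu^{\sigma-1} (1 - a\mu^2)^{1/\alpha - 1} \bigl[\sigma(1 - a\mu^2) - 2a\mu^2/\alpha\bigr]$. Then $a\mu_{\max}^2 =
\frac{\sigma\alpha}{\sigma\alpha + 2}$ with value $f(\mu_{\max}) \le c_\sigma\,\alpha^{\sigma/2}$. Again, we leave the details to the reader.
(c) For property (2), it is sufficient to consider the case $\mu^2 \ge \alpha$. In this case,
$q(\alpha,\mu) = 1 \le \mu/\sqrt{\alpha}$. For (3b), we consider only the case $\mu^2 < \alpha$ and have

$$\mu^\sigma \bigl(1 - q(\alpha,\mu)\bigr) = \mu^\sigma \le \alpha^{\sigma/2}.$$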
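Properties (1) and (2) of the three filter functions can also be checked numerically on a grid. The following is a minimal sketch and no substitute for the proof: it assumes the normalization $\|K\| = 1$, the value $a = 0.5$ for the filter in (b), discrete parameters $\alpha = 1/m$, and function names chosen here for illustration.

```python
import math

A = 0.5  # parameter a of filter (b); assumed to satisfy 0 < a < 1/||K||^2 with ||K|| = 1

def q_tikhonov(alpha, mu):
    return mu ** 2 / (alpha + mu ** 2)

def q_landweber(alpha, mu):
    return 1.0 - (1.0 - A * mu ** 2) ** (1.0 / alpha)

def q_cutoff(alpha, mu):
    return 1.0 if mu ** 2 >= alpha else 0.0

# Each filter q together with the constant c(alpha) from Theorem 2.8.
FILTERS = {
    "tikhonov": (q_tikhonov, lambda alpha: 1.0 / (2.0 * math.sqrt(alpha))),
    "landweber": (q_landweber, lambda alpha: math.sqrt(A / alpha)),
    "cutoff": (q_cutoff, lambda alpha: 1.0 / math.sqrt(alpha)),
}

def satisfies_1_and_2(q, c, alpha, n=1000, tol=1e-12):
    """Check |q| <= 1 and |q| <= c(alpha) * mu on a grid of mu in (0, 1]."""
    mus = [(k + 1) / n for k in range(n)]
    return all(abs(q(alpha, mu)) <= 1.0 + tol and
               abs(q(alpha, mu)) <= c(alpha) * mu + tol
               for mu in mus)

results = {name: all(satisfies_1_and_2(q, c, 1.0 / m) for m in (1, 10, 100, 1000))
           for name, (q, c) in FILTERS.items()}
print(results)
```

The small tolerance `tol` only guards against rounding, since for the filter in (a) the bound $c(\alpha)\mu$ is attained at $\mu = \sqrt{\alpha}$.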


We will see later that the regularization methods for the first two choices of
$q$ admit a characterization that avoids knowledge of the singular system. The
choice (c) of $q$ is called the spectral cutoff. The spectral cutoff solution $x^{\alpha,\delta} \in X$
is therefore defined by

$$x^{\alpha,\delta} = \sum_{\mu_j^2 \ge \alpha} \frac{1}{\mu_j}\,(y^\delta, y_j)_Y\,x_j.$$
For this spectral cutoff solution, we combine the fundamental estimates (2.12a),
(2.12b) with the previous theorem and show the following result.


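Theorem 2.9 (a) Let $K : X \to Y$ be a compact and injective operator with
singular system $\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$. The operators

$$R_\alpha y := \sum_{\mu_j^2 \ge \alpha} \frac{1}{\mu_j}\,(y, y_j)_Y\,x_j, \quad y \in Y, \tag{2.13}$$

define a regularization strategy with $\|R_\alpha\|_{\mathcal{L}(Y,X)} \le 1/\sqrt{\alpha}$. Thus the parameter
choice $\alpha = \alpha(\delta)$ is admissible if $\alpha(\delta) \to 0$ ($\delta \to 0$) and $\delta^2/\alpha(\delta) \to 0$ ($\delta \to 0$).

(b) Let $Kx^* = y^*$ and $y^\delta \in Y$ be such that $\|y^\delta - y^*\|_Y \le \delta$. Furthermore, let
$x^* = K^* z \in \mathcal{R}(K^*)$ with $\|z\|_Y \le E$ and $c > 0$. For the choice $\alpha(\delta) = c\,\delta/E$, we
have the following error estimates for the spectral cutoff regularization.

$$\|x^{\alpha(\delta),\delta} - x^*\|_X \le \Bigl(\frac{1}{\sqrt{c}} + \sqrt{c}\Bigr) \sqrt{\delta E}, \tag{2.14a}$$
$$\|Kx^{\alpha(\delta),\delta} - y^*\|_Y \le (1 + c)\,\delta. \tag{2.14b}$$

(c) Let $x^* = (K^*K)^{\sigma/2} z \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for some $\sigma > 0$ with $\|z\|_X \le E$. The
choice $\alpha(\delta) = c\,(\delta/E)^{2/(\sigma+1)}$ leads to the estimates

$$\|x^{\alpha(\delta),\delta} - x^*\|_X \le \Bigl(\frac{1}{\sqrt{c}} + c^{\sigma/2}\Bigr)\,\delta^{\sigma/(\sigma+1)}\,E^{1/(\sigma+1)}, \tag{2.14c}$$
$$\|Kx^{\alpha(\delta),\delta} - y^*\|_Y \le \bigl(1 + c^{(\sigma+1)/2}\bigr)\,\delta. \tag{2.14d}$$

Therefore, the spectral cutoff regularization is optimal under the information
$\|(K^*)^{-1} x^*\|_Y \le E$ or $\|(K^*K)^{-\sigma/2} x^*\|_X \le E$, respectively (if $K^*$ is one-to-one).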


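Proof: Combining the fundamental estimates (2.12a), (2.12b) with Theorem 2.8
(part (c)) yields the error estimates

$$\|x^{\alpha,\delta} - x^*\|_X \le \frac{\delta}{\sqrt{\alpha}} + \sqrt{\alpha}\,\|z\|_Y,$$
$$\|Kx^{\alpha,\delta} - y^*\|_Y \le \delta + \alpha\,\|z\|_Y,$$

for part (b) and

$$\|x^{\alpha,\delta} - x^*\|_X \le \frac{\delta}{\sqrt{\alpha}} + \alpha^{\sigma/2}\,\|z\|_X,$$
$$\|Kx^{\alpha,\delta} - y^*\|_Y \le \delta + \alpha^{(\sigma+1)/2}\,\|z\|_X$$

for part (c). The choices $\alpha(\delta) = c\,\delta/E$ and $\alpha(\delta) = c\,(\delta/E)^{2/(\sigma+1)}$, respectively,
lead to the estimates (2.14a), (2.14b) and (2.14c), (2.14d), respectively.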
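The estimates (2.14a) and (2.14b) can be observed in a finite-dimensional model. The following is a minimal sketch under stated assumptions: all elements are represented by their first $N$ expansion coefficients with respect to a singular system, the singular values $\mu_j = 2/((2j-1)\pi)$ are those of the integration operator $(Kx)(t) = \int_0^t x(s)\,ds$ on $L^2(0,1)$, and the source element $z$, the perturbation, and all names are chosen here for illustration.

```python
import math

def mu(j):
    """Singular values mu_j = 2 / ((2j - 1) pi) of (Kx)(t) = int_0^t x(s) ds on L^2(0,1)."""
    return 2.0 / ((2 * j - 1) * math.pi)

def norm(coeffs):
    return math.sqrt(sum(c * c for c in coeffs))

def spectral_cutoff(y_coeffs, alpha):
    """Coefficients of R_alpha y with respect to (x_j), cf. (2.13)."""
    return [c / mu(j) if mu(j) ** 2 >= alpha else 0.0
            for j, c in enumerate(y_coeffs, start=1)]

N = 500
z = [1.0 / j for j in range(1, N + 1)]                          # source element
E = norm(z)
x_true = [mu(j) * zj for j, zj in enumerate(z, start=1)]        # x* = K^* z
y_true = [mu(j) * xj for j, xj in enumerate(x_true, start=1)]   # y* = K x*

delta = 1.0e-4
pert = [math.cos(j) for j in range(1, N + 1)]                   # deterministic perturbation
scale = delta / norm(pert)
y_delta = [y + scale * p for y, p in zip(y_true, pert)]         # ||y_delta - y*|| = delta

c = 1.0
alpha = c * delta / E                                           # a priori choice of part (b)
x_rec = spectral_cutoff(y_delta, alpha)

err_x = norm([a - b for a, b in zip(x_rec, x_true)])
err_y = norm([mu(j) * xr - yt
              for j, (xr, yt) in enumerate(zip(x_rec, y_true), start=1)])
bound_x = (1.0 / math.sqrt(c) + math.sqrt(c)) * math.sqrt(delta * E)   # (2.14a)
bound_y = (1.0 + c) * delta                                            # (2.14b)
print(err_x, bound_x, err_y, bound_y)
```

Both computed errors stay below the bounds of part (b); the bounds are worst-case estimates, so the actual errors for a particular perturbation can be considerably smaller.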


The general regularization concept discussed in this section can be found
in many books on inverse theory [17, 110, 182]. It was not the aim of this
section to study the most general theory. This concept has been extended in
several directions. For example, in [84] (see also [88]) the notions of strong and
weak convergence and divergence are defined, and in [182] different notions of
optimality of regularization schemes are discussed.

The idea of using filters has a long history [109, 265] and is very convenient 
for theoretical purposes. For given concrete integral operators, however, one 
often wants to avoid the computation of a singular system. In the next sections, 
we give equivalent characterizations for the first two examples without using 
singular systems. 


2.2 Tikhonov Regularization 


A common method to deal with overdetermined finite linear systems of the form
$Kx = y$ is to determine the best fit in the sense that one tries to minimize the
defect $\|Kx - y\|_Y$ with respect to $x \in X$ for some norm in $Y$. If $X$ is
infinite-dimensional and $K$ is compact, this minimization problem is also ill-posed by
the following lemma.
Lemma 2.10 Let $X$ and $Y$ be Hilbert spaces, $K : X \to Y$ be linear and bounded,
and $y^* \in Y$. There exists $\hat{x} \in X$ with $\|K\hat{x} - y^*\|_Y \le \|Kx - y^*\|_Y$ for all
$x \in X$ if and only if $\hat{x} \in X$ solves the normal equation $K^*K\hat{x} = K^*y^*$. Here,
$K^* : Y \to X$ denotes the adjoint of $K$.


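Proof: A simple application of the binomial theorem yields

$$\|Kx - y^*\|_Y^2 - \|K\hat{x} - y^*\|_Y^2 = 2\,\mathrm{Re}\,\bigl(K\hat{x} - y^*, K(x - \hat{x})\bigr)_Y + \|K(x - \hat{x})\|_Y^2 = 2\,\mathrm{Re}\,\bigl(K^*(K\hat{x} - y^*), x - \hat{x}\bigr)_X + \|K(x - \hat{x})\|_Y^2$$

for all $x, \hat{x} \in X$. If $\hat{x}$ satisfies $K^*K\hat{x} = K^*y^*$, then $\|Kx - y^*\|_Y^2 - \|K\hat{x} - y^*\|_Y^2 \ge 0$,
that is, $\hat{x}$ minimizes $\|Kx - y^*\|_Y$. If, on the other hand, $\hat{x}$ minimizes $\|Kx - y^*\|_Y$,
then we substitute $x = \hat{x} + tz$ for any $t > 0$ and $z \in X$ and arrive at

$$0 \le 2t\,\mathrm{Re}\,\bigl(K^*(K\hat{x} - y^*), z\bigr)_X + t^2\,\|Kz\|_Y^2.$$
Division by $t > 0$ and $t \to 0$ yields $\mathrm{Re}\,\bigl(K^*(K\hat{x} - y^*), z\bigr)_X \ge 0$ for all $z \in X$;
that is, $K^*(K\hat{x} - y^*) = 0$, and $\hat{x}$ solves the normal equation.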


As a consequence of this lemma, we should penalize the defect (in the language
of optimization theory) or replace the equation of the first kind $K^*K\hat{x} =
K^*y^*$ by an equation of the second kind (in the language of integral equation
theory). Both viewpoints lead to the following minimization problem.

Given the linear, bounded operator $K : X \to Y$ and $y \in Y$, determine
$x^\alpha \in X$ that minimizes the Tikhonov functional

$$J_\alpha(x) := \|Kx - y\|_Y^2 + \alpha\,\|x\|_X^2 \quad \text{for } x \in X. \tag{2.15}$$
We prove the following theorem.


Theorem 2.11 Let $K : X \to Y$ be a linear and bounded operator between
Hilbert spaces and $\alpha > 0$. Then the Tikhonov functional $J_\alpha$ has a unique minimum
$x^\alpha \in X$. This minimum $x^\alpha$ is the unique solution of the normal equation
$$\alpha x^\alpha + K^*K x^\alpha = K^* y. \tag{2.16}$$

The operator $\alpha I + K^*K$ is an isomorphism from $X$ onto itself for every $\alpha > 0$.

Proof: We use the following formula as in the proof of the previous lemma:

$$J_\alpha(x) - J_\alpha(x^\alpha) = 2\,\mathrm{Re}\,\bigl(Kx^\alpha - y, K(x - x^\alpha)\bigr)_Y + 2\alpha\,\mathrm{Re}\,(x^\alpha, x - x^\alpha)_X + \|K(x - x^\alpha)\|_Y^2 + \alpha\,\|x - x^\alpha\|_X^2$$
$$= 2\,\mathrm{Re}\,\bigl(K^*(Kx^\alpha - y) + \alpha x^\alpha, x - x^\alpha\bigr)_X + \|K(x - x^\alpha)\|_Y^2 + \alpha\,\|x - x^\alpha\|_X^2 \tag{2.17}$$

for all $x \in X$. From this, the equivalence of the normal equation with the
minimization problem for $J_\alpha$ is shown exactly as in the proof of Lemma 2.10. Next,
we show that $\alpha I + K^*K$ is one-to-one for every $\alpha > 0$. Let $\alpha x + K^*K x = 0$.
Multiplication by $x$ yields $\alpha\,(x, x)_X + (Kx, Kx)_Y = 0$, that is, $x = 0$. Finally,
we show that $\alpha I + K^*K$ is onto. Since $\alpha I + K^*K$ is one-to-one and
self-adjoint, we conclude that its range is dense in $X$. It remains to show that
the range is closed. Let $z_n = \alpha x_n + K^*K x_n$ converge to some $z \in X$. Then
$z_n - z_m = \alpha\,(x_n - x_m) + K^*K(x_n - x_m)$. Multiplication of this equation by
$x_n - x_m$ yields

$$\alpha\,\|x_n - x_m\|_X^2 + \|K(x_n - x_m)\|_Y^2 = (z_n - z_m, x_n - x_m)_X \le \|z_n - z_m\|_X\,\|x_n - x_m\|_X.$$

From this, we conclude that $\alpha\,\|x_n - x_m\|_X \le \|z_n - z_m\|_X$. Therefore, $(x_n)$ is a
Cauchy sequence and thus convergent to some $x \in X$ which obviously satisfies
$\alpha x + K^*K x = z$.
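For a real matrix $K$ the statements of Theorem 2.11 can be tried out directly. The following is a minimal sketch: the $3 \times 2$ matrix, the data vector, and the value of $\alpha$ are hypothetical, and the normal equation (2.16) is solved by Cramer's rule for the resulting $2 \times 2$ system. The sketch then checks identity (2.17) at the solution, where the first term on the right-hand side vanishes.

```python
def tikhonov_2x2(K, y, alpha):
    """Solve (alpha I + K^T K) x = K^T y for a real m-by-2 matrix K by Cramer's rule."""
    a11 = alpha + sum(row[0] * row[0] for row in K)
    a12 = sum(row[0] * row[1] for row in K)
    a22 = alpha + sum(row[1] * row[1] for row in K)
    b1 = sum(row[0] * yi for row, yi in zip(K, y))
    b2 = sum(row[1] * yi for row, yi in zip(K, y))
    det = a11 * a22 - a12 * a12  # positive for every alpha > 0
    return [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]

def tikhonov_functional(K, y, alpha, x):
    """J_alpha(x) = ||Kx - y||^2 + alpha ||x||^2, cf. (2.15)."""
    residual = [row[0] * x[0] + row[1] * x[1] - yi for row, yi in zip(K, y)]
    return sum(r * r for r in residual) + alpha * (x[0] ** 2 + x[1] ** 2)

# Hypothetical, nearly rank-deficient example.
K = [[1.0, 1.0], [1.0, 1.001], [1.0, 0.999]]
y = [2.0, 2.01, 1.98]
alpha = 1.0e-2

x_alpha = tikhonov_2x2(K, y, alpha)

# Identity (2.17) at the minimizer:
# J(x) - J(x_alpha) = ||K(x - x_alpha)||^2 + alpha ||x - x_alpha||^2.
d = [0.3, -0.2]
x = [x_alpha[0] + d[0], x_alpha[1] + d[1]]
gap = tikhonov_functional(K, y, alpha, x) - tikhonov_functional(K, y, alpha, x_alpha)
expected_gap = (sum((row[0] * d[0] + row[1] * d[1]) ** 2 for row in K)
                + alpha * (d[0] ** 2 + d[1] ** 2))
print(x_alpha, gap, expected_gap)
```

The gap is strictly positive for every $d \ne 0$, which reflects the uniqueness of the minimum.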


The solution $x^\alpha$ of equation (2.16) can be written in the form $x^\alpha = R_\alpha y$
with
$$R_\alpha := (\alpha I + K^*K)^{-1} K^* : Y \to X. \tag{2.18}$$

Choosing a singular system $\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$ for the compact and injective
operator $K$, we see that $R_\alpha y$ has the representation

$$R_\alpha y = \sum_{j=1}^{\infty} \frac{\mu_j}{\alpha + \mu_j^2}\,(y, y_j)_Y\,x_j = \sum_{j=1}^{\infty} \frac{q(\alpha,\mu_j)}{\mu_j}\,(y, y_j)_Y\,x_j, \quad y \in Y, \tag{2.19}$$

with $q(\alpha,\mu) = \mu^2/(\alpha + \mu^2)$. This function $q$ is exactly the filter function that
was studied in Theorem 2.8, part (a). Therefore, applications of Theorems 2.6
and 2.7 yield the following.


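Theorem 2.12 Let $K : X \to Y$ be a linear, compact, and injective operator
and $\alpha > 0$ and $x^* \in X$ be the exact solution of $Kx^* = y^*$. Furthermore, let
$y^\delta \in Y$ with $\|y^\delta - y^*\|_Y \le \delta$.

(a) The operators $R_\alpha : Y \to X$ from (2.18) form a regularization strategy with
$\|R_\alpha\|_{\mathcal{L}(Y,X)} \le 1/(2\sqrt{\alpha})$. It is called the Tikhonov regularization method. $R_\alpha y^\delta$
is determined as the unique solution $x^{\alpha,\delta} \in X$ of the equation of the second kind

$$\alpha x^{\alpha,\delta} + K^*K x^{\alpha,\delta} = K^* y^\delta. \tag{2.20}$$

Every choice $\alpha(\delta) \to 0$ ($\delta \to 0$) with $\delta^2/\alpha(\delta) \to 0$ ($\delta \to 0$) is admissible.
(b) Let $x^* = K^* z \in \mathcal{R}(K^*)$ with $\|z\|_Y \le E$. We choose $\alpha(\delta) = c\,\delta/E$ for some
$c > 0$. Then the following estimates hold:

$$\|x^{\alpha(\delta),\delta} - x^*\|_X \le \frac{1}{2}\Bigl(\frac{1}{\sqrt{c}} + \sqrt{c}\Bigr) \sqrt{\delta E}, \tag{2.21a}$$
$$\|Kx^{\alpha(\delta),\delta} - y^*\|_Y \le (1 + c)\,\delta. \tag{2.21b}$$

(c) For some $\sigma \in (0,2]$, let $x^* = (K^*K)^{\sigma/2} z \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ with $\|z\|_X \le E$.
The choice $\alpha(\delta) = c\,(\delta/E)^{2/(\sigma+1)}$ for $c > 0$ leads to the error estimates

$$\|x^{\alpha(\delta),\delta} - x^*\|_X \le \Bigl(\frac{1}{2\sqrt{c}} + c_\sigma\,c^{\sigma/2}\Bigr)\,\delta^{\sigma/(\sigma+1)}\,E^{1/(\sigma+1)}, \tag{2.21c}$$
$$\|Kx^{\alpha(\delta),\delta} - y^*\|_Y \le \bigl(1 + c_{\sigma+1}\,c^{(\sigma+1)/2}\bigr)\,\delta. \tag{2.21d}$$

Here, $c_\sigma$ are the constants for the choice of $q$ of part (a) of Theorem 2.8. Therefore,
for $\sigma \le 2$ Tikhonov’s regularization method is optimal for the information
$\|(K^*)^{-1} x^*\|_Y \le E$ or $\|(K^*K)^{-\sigma/2} x^*\|_X \le E$, respectively (provided $K^*$ is
one-to-one).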


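Proof: Combining the fundamental estimates (2.12a), (2.12b) with Theorem 2.8
(part (a)) yields the error estimates

$$\|x^{\alpha,\delta} - x^*\|_X \le \frac{\delta}{2\sqrt{\alpha}} + \frac{\sqrt{\alpha}}{2}\,\|z\|_Y,$$
$$\|Kx^{\alpha,\delta} - y^*\|_Y \le \delta + \alpha\,\|z\|_Y$$

for part (b) and

$$\|x^{\alpha,\delta} - x^*\|_X \le \frac{\delta}{2\sqrt{\alpha}} + c_\sigma\,\alpha^{\sigma/2}\,\|z\|_X,$$
$$\|Kx^{\alpha,\delta} - y^*\|_Y \le \delta + c_{\sigma+1}\,\alpha^{(\sigma+1)/2}\,\|z\|_X$$
for part (c). The choices $\alpha(\delta) = c\,\delta/E$ and $\alpha(\delta) = c\,(\delta/E)^{2/(\sigma+1)}$, respectively,
lead to the estimates (2.21a), (2.21b) and (2.21c), (2.21d), respectively.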
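For part (b), the substitution can be written out as follows. This is a short worked computation that uses only the two estimates for part (b), the bound $\|z\|_Y \le E$, and the choice $\alpha = c\,\delta/E$.

```latex
\begin{align*}
\frac{\delta}{2\sqrt{\alpha}} + \frac{\sqrt{\alpha}}{2}\,E
  &= \frac{\delta}{2}\sqrt{\frac{E}{c\,\delta}} + \frac{1}{2}\sqrt{\frac{c\,\delta}{E}}\;E
   = \frac{1}{2}\Bigl(\frac{1}{\sqrt{c}} + \sqrt{c}\Bigr)\sqrt{\delta E}, \\
\delta + \alpha E &= \delta + c\,\delta = (1 + c)\,\delta .
\end{align*}
```

Since $\frac{1}{2}\bigl(1/\sqrt{c} + \sqrt{c}\bigr) \ge 1$ with equality for $c = 1$, the smallest constant in (2.21a) obtained in this way belongs to $\alpha(\delta) = \delta/E$.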


From Theorem 2.12, we observe that $\alpha$ has to be chosen to depend on $\delta$
in such a way that it converges to zero as $\delta$ tends to zero but not as fast as
$\delta^2$. From parts (b) and (c), we conclude that the smoother the solution $x^*$ is,
the slower $\alpha$ has to tend to zero. On the other hand, the convergence can be
arbitrarily slow if no a priori assumption about the solution $x^*$ (such as (b) or
(c)) is available (see [243]).

The case $\sigma = 2$ leads to the order $\mathcal{O}(\delta^{2/3})$ for $\|x^{\alpha(\delta),\delta} - x^*\|_X$. It is surprising
to note that this order of convergence of Tikhonov’s regularization method
cannot be improved even if $x^* \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for $\sigma > 2$. Indeed, we prove the
following result.


Theorem 2.13 Let $K : X \to Y$ be linear, compact, and one-to-one such that
the range $\mathcal{R}(K)$ is infinite-dimensional. Furthermore, let $x \in X$, and assume
that there exists a continuous function $\alpha : [0,\infty) \to [0,\infty)$ with $\alpha(0) = 0$ such
that

$$\lim_{\delta \to 0} \|x^{\alpha(\delta),\delta} - x\|_X\,\delta^{-2/3} = 0$$
for every $y^\delta \in Y$ with $\|y^\delta - Kx\|_Y \le \delta$, where $x^{\alpha(\delta),\delta} \in X$ solves (2.20) for
$\alpha = \alpha(\delta)$. Then $x = 0$.




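Proof: Assume, on the contrary, that $x \ne 0$.
First, we show that $\alpha(\delta)\,\delta^{-2/3} \to 0$. Set $y = Kx$. From

$$\bigl(\alpha(\delta) I + K^*K\bigr)\bigl(x^{\alpha(\delta),\delta} - x\bigr) = K^*(y^\delta - y) - \alpha(\delta)\,x,$$
we estimate
$$|\alpha(\delta)|\,\|x\|_X \le \|K\|_{\mathcal{L}(X,Y)}\,\delta + \bigl(\alpha(\delta) + \|K\|_{\mathcal{L}(X,Y)}^2\bigr)\,\|x^{\alpha(\delta),\delta} - x\|_X.$$

We multiply this estimate by $\delta^{-2/3}$ and use the assumption that $x^{\alpha(\delta),\delta}$ tends
to $x$ faster than $\delta^{2/3}$ to zero, that is,

$$\|x^{\alpha(\delta),\delta} - x\|_X\,\delta^{-2/3} \to 0.$$

This yields $\alpha(\delta)\,\delta^{-2/3} \to 0$.
In the second part we construct a contradiction. Let $\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$ be
a singular system for $K$. Define

$$\delta_j := \mu_j^3 \quad \text{and} \quad y^{\delta_j} := y + \delta_j\,y_j, \quad j \in \mathbb{N}.$$

Then $\delta_j \to 0$ as $j \to \infty$ and, with $\alpha_j := \alpha(\delta_j)$ and $x^{\alpha_j} := (\alpha_j I + K^*K)^{-1} K^* y$,

$$x^{\alpha_j,\delta_j} - x = \bigl(x^{\alpha_j,\delta_j} - x^{\alpha_j}\bigr) + \bigl(x^{\alpha_j} - x\bigr) = (\alpha_j I + K^*K)^{-1} K^*(\delta_j y_j) + \bigl(x^{\alpha_j} - x\bigr) = \frac{\delta_j\,\mu_j}{\alpha_j + \mu_j^2}\,x_j + \bigl(x^{\alpha_j} - x\bigr).$$
Because also $\|x^{\alpha_j} - x\|_X\,\delta_j^{-2/3} \to 0$, we conclude that
$$\frac{\delta_j^{1/3}\,\mu_j}{\alpha_j + \mu_j^2} \to 0, \quad j \to \infty.$$
But, on the other hand,

$$\frac{\delta_j^{1/3}\,\mu_j}{\alpha_j + \mu_j^2} = \frac{\mu_j^2}{\alpha_j + \mu_j^2} = \bigl(1 + \alpha_j\,\delta_j^{-2/3}\bigr)^{-1} \to 1, \quad j \to \infty.$$

This is a contradiction.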


This result shows that Tikhonov’s regularization method is not optimal for
stronger “smoothness” assumptions $x^* \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for $\sigma > 2$. This is in
contrast to, for example, the spectral cutoff regularization (see Theorem 2.9
above) or Landweber’s method or the conjugate gradient method, which are
discussed later.

The choice of $\alpha$ in Theorem 2.12 is made a priori, that is, before starting
the computation of $x^{\alpha,\delta}$ by solving the least squares problem. In Sections 2.5
to 2.7 we study a posteriori choices of $\alpha$, that is, choices of $\alpha$ made during the
process of computing $x^{\alpha,\delta}$.

It is possible to choose stronger norms in the penalty term of the Tikhonov
functional. Instead of (2.15), one can minimize the functional

$$\|Kx - y^\delta\|_Y^2 + \alpha\,\|x\|_1^2 \quad \text{on } X_1,$$

where $\|\cdot\|_1$ is a stronger norm (or only seminorm) on a subspace $X_1 \subset X$.
This was originally done by Phillips [217] and Tikhonov [261, 262] (see also
[97]) for linear integral equations of the first kind. They chose the seminorm
$\|x\|_1 := \|x'\|_{L^2}$ or the $H^1$-norm $\|x\|_1 := \bigl(\|x\|_{L^2}^2 + \|x'\|_{L^2}^2\bigr)^{1/2}$. By characterizing
$\|\cdot\|_1$ through a singular system for $K$, one obtains similar convergence results
as above in the stronger norm $\|\cdot\|_1$. For further aspects of regularization with
differential operators or stronger norms, we refer to [70, 119, 180, 205] and the
monographs [110, 111, 182]. The interpretation of regularization by smoothing
norms in terms of reproducing kernel Hilbert spaces has been observed in [133].


2.3 Landweber Iteration


Landweber [172], Fridman [98], and Bialy [18] suggested rewriting the equation
$Kx = y$ in the form $x = (I - aK^*K)x + aK^*y$ for some $a > 0$ and iterating this
equation, that is, computing

$$x^0 := 0 \quad \text{and} \quad x^m = (I - aK^*K)\,x^{m-1} + aK^*y \tag{2.22}$$

for $m = 1, 2, \ldots$. This iteration scheme can be interpreted as the steepest
descent algorithm applied to the quadratic functional $x \mapsto \|Kx - y\|_Y^2$ as the
following lemma shows.


Lemma 2.14 Let the sequence $(x^m)$ be defined by (2.22) and define the functional
$\psi : X \to \mathbb{R}$ by $\psi(x) = \frac{1}{2}\|Kx - y\|_Y^2$, $x \in X$. Then $\psi$ is Fréchet differentiable
in every $z \in X$ and

$$\psi'(z)x = \mathrm{Re}\,(Kz - y, Kx)_Y = \mathrm{Re}\,\bigl(K^*(Kz - y), x\bigr)_X, \quad x \in X. \tag{2.23}$$

The linear functional $\psi'(z)$ can be identified with $K^*(Kz - y) \in X$ in the Hilbert
space $X$ over the field $\mathbb{R}$. Therefore, $x^m = x^{m-1} - aK^*(Kx^{m-1} - y)$ is the
steepest descent step with stepsize $a$.


Proof: The binomial formula yields

$$\psi(z + x) - \psi(z) - \mathrm{Re}\,(Kz - y, Kx)_Y = \frac{1}{2}\,\|Kx\|_Y^2,$$

and thus

$$\bigl|\psi(z + x) - \psi(z) - \mathrm{Re}\,(Kz - y, Kx)_Y\bigr| \le \frac{1}{2}\,\|K\|_{\mathcal{L}(X,Y)}^2\,\|x\|_X^2,$$

which proves that the mapping $x \mapsto \mathrm{Re}\,(Kz - y, Kx)_Y$ is the Fréchet derivative
of $\psi$ at $z$.


Equation (2.22) is a linear recursion formula for $x^m$. By induction with
respect to $m$, it is easily seen that $x^m$ has the explicit form $x^m = R_m y$, where
the operator $R_m : Y \to X$ is defined by

$$R_m := a \sum_{k=0}^{m-1} (I - aK^*K)^k K^* \quad \text{for } m = 1, 2, \ldots. \tag{2.24}$$
Choosing a singular system $\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$ for the compact and injective
operator $K$, we see that $R_m y$ has the representation

$$R_m y = a \sum_{j=1}^{\infty} \mu_j \sum_{k=0}^{m-1} (1 - a\mu_j^2)^k\,(y, y_j)_Y\,x_j = \sum_{j=1}^{\infty} \frac{1}{\mu_j}\bigl[1 - (1 - a\mu_j^2)^m\bigr]\,(y, y_j)_Y\,x_j = \sum_{j=1}^{\infty} \frac{q(m,\mu_j)}{\mu_j}\,(y, y_j)_Y\,x_j, \quad y \in Y, \tag{2.25}$$

with $q(m,\mu) = 1 - (1 - a\mu^2)^m$. We studied this filter function $q$ in Theorem 2.8,
part (b), when we define $\alpha = 1/m$. Therefore, applications of Theorems 2.6
and 2.7 yield the following result.


Theorem 2.15 Again let $K : X \to Y$ be a compact and injective operator and let $0 < a < 1/\|K\|_{\mathcal{L}(X,Y)}^2$. Let $x^* \in X$ be the exact solution of $Kx^* = y^*$. Furthermore, let $y^\delta \in Y$ with $\|y^\delta - y^*\|_Y \le \delta$.

(a) Define the linear and bounded operators $R_m : Y \to X$ by (2.24). These operators $R_m$ define a regularization strategy with discrete regularization parameter $\alpha = 1/m$, $m \in \mathbb{N}$, and $\|R_m\|_{\mathcal{L}(Y,X)} \le \sqrt{am}$. The sequence $x^{m,\delta} = R_m y^\delta$ is computed by the iteration (2.22), that is,

$$x^{0,\delta} = 0 \quad\text{and}\quad x^{m,\delta} = (I - aK^*K)\,x^{m-1,\delta} + aK^*y^\delta \tag{2.26}$$

for $m = 1, 2, \ldots$. Every strategy $m(\delta) \to \infty$ $(\delta \to 0)$ with $\delta^2 m(\delta) \to 0$ $(\delta \to 0)$ is admissible.

(b) Again let $x^* = K^*z \in \mathcal{R}(K^*)$ with $\|z\|_Y \le E$ and $0 < c_1 < c_2$. For every choice $m(\delta)$ with $c_1 E/\delta \le m(\delta) \le c_2 E/\delta$, the following estimates hold:

$$\|x^{m(\delta),\delta} - x^*\|_X \le c_3 \sqrt{\delta E}, \tag{2.27a}$$
$$\|Kx^{m(\delta),\delta} - y^*\|_Y \le \bigl(1 + 1/(a c_1)\bigr)\,\delta, \tag{2.27b}$$

for some $c_3$ depending on $c_1$, $c_2$, and $a$. Therefore, the Landweber iteration is optimal under the information $\|(K^*)^{-1}x^*\|_Y \le E$.

(c) For some $\sigma > 0$, let $x^* = (K^*K)^{\sigma/2}z \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ with $\|z\|_X \le E$ and let $0 < c_1 < c_2$. For every choice $m(\delta)$ with $c_1 (E/\delta)^{2/(\sigma+1)} \le m(\delta) \le c_2 (E/\delta)^{2/(\sigma+1)}$, we have

$$\|x^{m(\delta),\delta} - x^*\|_X \le c_3\, \delta^{\sigma/(\sigma+1)} E^{1/(\sigma+1)}, \tag{2.27c}$$
$$\|Kx^{m(\delta),\delta} - y^*\|_Y \le c_3\, \delta, \tag{2.27d}$$

for some $c_3$ depending on $c_1$, $c_2$, $\sigma$, and $a$. Therefore, the Landweber iteration is also optimal for the information $\|(K^*K)^{-\sigma/2}x^*\|_X \le E$ for every $\sigma > 0$.


Proof: Combining the fundamental estimates (2.4a), (2.12a), and (2.12b) with Theorem 2.8 (part (b)) yields the error estimates

$$\|x^{m,\delta} - x^*\|_X \le \delta\sqrt{am} + \frac{1}{\sqrt{2am}}\,\|z\|_Y,$$
$$\|Kx^{m,\delta} - y^*\|_Y \le \delta + \frac{1}{am}\,\|z\|_Y,$$

for part (b) and

$$\|x^{m,\delta} - x^*\|_X \le \delta\sqrt{am} + \Bigl(\frac{\sigma}{2am}\Bigr)^{\sigma/2} \|z\|_X,$$
$$\|Kx^{m,\delta} - y^*\|_Y \le \delta + \Bigl(\frac{\sigma+1}{2am}\Bigr)^{(\sigma+1)/2} \|z\|_X, \tag{2.28}$$

for part (c). Replacing $m$ in the first term by the upper bound and in the second by the lower bound yields estimates (2.27a), (2.27b) and (2.27c), (2.27d), respectively.


The choice $x^0 = 0$ is made to simplify the analysis. In general, the explicit iteration $x^m$ is given by

$$x^m = a \sum_{k=0}^{m-1} (I - aK^*K)^k K^*y + (I - aK^*K)^m x^0, \quad m = 1, 2, \ldots.$$

In this case, $R_m$ is affine linear, that is, of the form $R_m y = z^m + S_m y$, $y \in Y$, for some $z^m \in X$ and some linear operator $S_m : Y \to X$.

For this method, we observe again that high precision (ignoring the presence 
of errors) requires a large number m of iterations but stability forces us to keep 
m small enough. 
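The filter representation (2.25) can be checked numerically in the coordinates of a singular system, where $K$ acts as multiplication by $\mu_j$. The following is a minimal sketch under that assumption; the singular values `mus`, the data coefficients `etas`, and the function names are hypothetical. It compares $m$ steps of the recursion (2.22), carried out componentwise, with the closed form $q(m,\mu_j)/\mu_j \cdot (y,y_j)_Y$, and evaluates the elementary bound $q(m,\mu) \le \sqrt{am}\,\mu$ that lies behind $\|R_m\| \le \sqrt{am}$.

```python
import math


def landweber_component(mu, eta, a, m):
    """m steps of (2.22) for one singular component: xi <- (1 - a mu^2) xi + a mu eta."""
    xi = 0.0
    for _ in range(m):
        xi = (1.0 - a * mu * mu) * xi + a * mu * eta
    return xi


def filter_q(m, mu, a):
    """Landweber filter function q(m, mu) = 1 - (1 - a mu^2)^m."""
    return 1.0 - (1.0 - a * mu * mu) ** m


# Hypothetical singular values (largest one equals ||K|| = 1) and data coefficients.
mus = [1.0, 0.5, 0.1, 0.01]
etas = [1.0, -2.0, 0.5, 0.3]
a = 0.9   # satisfies 0 < a < 1 / ||K||^2 = 1
m = 25

iterated = [landweber_component(mu, eta, a, m) for mu, eta in zip(mus, etas)]
filtered = [filter_q(m, mu, a) / mu * eta for mu, eta in zip(mus, etas)]

# Amplification factors q(m, mu) / mu; each is bounded by sqrt(a m).
amplification = [filter_q(m, mu, a) / mu for mu in mus]
bound = math.sqrt(a * m)
```

Small singular values are damped (the factor $q(m,\mu)/\mu$ stays far below $1/\mu$) as long as $m$ is moderate, which is the stability side of the trade-off described above.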

We come back to the Landweber iteration in the next chapter, where we show that an optimal choice of $m(\delta)$ can be made a posteriori through a proper stopping rule.

Other possibilities for regularizing first kind equations $Kx = y$ with compact operators $K$, which we have not discussed, are methods using positivity or more general convexity constraints (see [27, 30, 235, 236]).


2.4 A Numerical Example


In this section, we demonstrate the regularization methods by Tikhonov and Landweber for the following integral equation of the first kind:

$$\int_0^1 (1 + ts)\, e^{ts}\, x(s)\, ds = e^t, \quad 0 \le t \le 1, \tag{2.29}$$

with unique solution $x^*(t) = 1$ (see Problem 2.1). The operator $K : L^2(0,1) \to L^2(0,1)$ is given by

$$(Kx)(t) = \int_0^1 (1 + ts)\, e^{ts}\, x(s)\, ds$$

and is self-adjoint, that is, $K^* = K$. We note that $x^*$ does not belong to the range of $K$ (see Problem 2.1). For the numerical evaluation of $Kx$, we use Simpson's rule. With $t_i = i/n$, $i = 0, \ldots, n$, $n$ even, we replace $(Kx)(t_i)$ by

$$\sum_{j=0}^{n} w_j\, (1 + t_i t_j)\, e^{t_i t_j}\, x(t_j) \quad\text{where}\quad w_j = \begin{cases} \frac{1}{3n}, & j = 0 \text{ or } n, \\ \frac{4}{3n}, & j = 1, 3, \ldots, n-1, \\ \frac{2}{3n}, & j = 2, 4, \ldots, n-2. \end{cases}$$

We note that the corresponding matrix $A$ is not symmetric. This leads to the discretized Tikhonov equation $\alpha x^{\alpha,\delta} + A^2 x^{\alpha,\delta} = A y^\delta$. Here, $y^\delta = (y_i^\delta) \in \mathbb{R}^{n+1}$ is a perturbation (uniformly distributed random vector) of the discrete right-hand side $y_i^* = \exp(i/n)$ such that

$$|y^* - y^\delta|_2 := \sqrt{\frac{1}{n+1} \sum_{i=0}^{n} (y_i^* - y_i^\delta)^2} \le \delta.$$

The average results of ten computations are given in the following tables, where we have listed the discrete norms $|1 - x^{\alpha,\delta}|_2$ of the errors between the exact solution $x^*(t) = 1$ and Tikhonov's approximation $x^{\alpha,\delta}$ (Table 2.1).


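Table 2.1: Tikhonov regularization for $\delta = 0$

$\alpha$    | $n = 8$               | $n = 16$
$10^{-1}$   | $2.4 \cdot 10^{-1}$   | $2.3 \cdot 10^{-1}$
$10^{-2}$   | $7.2 \cdot 10^{-2}$   | $6.8 \cdot 10^{-2}$
$10^{-3}$   | $2.6 \cdot 10^{-2}$   | $2.4 \cdot 10^{-2}$
$10^{-4}$   | $1.3 \cdot 10^{-2}$   | $1.2 \cdot 10^{-2}$
$10^{-5}$   | $2.6 \cdot 10^{-3}$   | $2.3 \cdot 10^{-3}$
$10^{-6}$   | $9.3 \cdot 10^{-4}$   | $8.7 \cdot 10^{-4}$
$10^{-7}$   | $3.5 \cdot 10^{-4}$   | $4.4 \cdot 10^{-4}$
$10^{-8}$   | $1.3 \cdot 10^{-3}$   | $3.2 \cdot 10^{-5}$
$10^{-9}$   | $1.6 \cdot 10^{-3}$   | $9.3 \cdot 10^{-5}$
$10^{-10}$  | $3.9 \cdot 10^{-3}$   | $2.1 \cdot 10^{-4}$


Table 2.2: Tikhonov regularization for $\delta > 0$

$\alpha$    | $\delta = 0.0001$ | $\delta = 0.001$ | $\delta = 0.01$ | $\delta = 0.1$
$10^{-1}$   | 0.2317            | 0.2317           | 0.2310          | 0.2255
$10^{-2}$   | 0.0681            | 0.0677           | 0.0692          | 0.1194
$10^{-3}$   | 0.0238            | 0.0240           | 0.0268          | 0.1651
$10^{-4}$   | 0.0119            | 0.0127           | 0.1172          | 1.0218
$10^{-5}$   | 0.0031            | 0.0168           | 0.2553          | 3.0065
$10^{-6}$   | 0.0065            | 0.0909           | 0.6513          | 5.9854
$10^{-7}$   | 0.0470            | 0.2129           | 2.4573          | 30.595
$10^{-8}$   | 0.1018            | 0.8119           | 5.9775          |
$10^{-9}$   | 0.1730            | 1.8985           | 16.587          |
$10^{-10}$  | 1.0723            | 14.642           |                 |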


In the first table, we have chosen $\delta = 0$; that is, only the discretization error for Simpson's rule is responsible for the increase of the error for small $\alpha$. This difference between discretization parameters $n = 8$ and $n = 16$ is noticeable for $\alpha \le 10^{-8}$. We refer to [267] for further examples (Table 2.2).

In the second table, we always took $n = 16$ and observed that the total error first decreases with decreasing $\alpha$ up to an optimal value and then increases again. This is predicted by the theory, in particular by estimates (2.21a) and (2.21b).
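The discretization described in this section can be sketched in a few lines. The following plain-Python sketch is an illustration under stated assumptions, not the program used for the tables: it builds the Simpson matrix $A$, solves the discretized Tikhonov equation $\alpha x + A^2 x = A y$ by Gaussian elimination, and evaluates the discrete error norm for unperturbed data ($\delta = 0$) only, so that no random numbers are involved. All function names are hypothetical.

```python
import math


def simpson_weights(n):
    """Composite Simpson weights on the grid t_j = j/n, j = 0..n (n even)."""
    return [(1.0 if j in (0, n) else 4.0 if j % 2 == 1 else 2.0) / (3 * n)
            for j in range(n + 1)]


def build_matrix(n):
    """A[i][j] = w_j (1 + t_i t_j) exp(t_i t_j), the Simpson discretization of K."""
    t = [i / n for i in range(n + 1)]
    w = simpson_weights(n)
    return [[w[j] * (1 + t[i] * t[j]) * math.exp(t[i] * t[j])
             for j in range(n + 1)] for i in range(n + 1)]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def tikhonov(A, y, alpha):
    """Solve the discretized Tikhonov equation alpha x + A^2 x = A y."""
    A2 = matmul(A, A)
    M = [[A2[i][j] + (alpha if i == j else 0.0) for j in range(len(A))]
         for i in range(len(A))]
    return solve(M, matvec(A, y))


def discrete_error(x):
    """Discrete norm |1 - x|_2 of the error against the exact solution x* = 1."""
    return math.sqrt(sum((1.0 - xi) ** 2 for xi in x) / len(x))


n = 16
A = build_matrix(n)
y = [math.exp(i / n) for i in range(n + 1)]   # unperturbed right-hand side
errors = {alpha: discrete_error(tikhonov(A, y, alpha))
          for alpha in (1e-1, 1e-2, 1e-3, 1e-4)}
```

For perturbed data one would replace `y` by a vector `y_delta` with $|y^* - y^\delta|_2 \le \delta$ and average over several random draws, as done for Table 2.2.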

In the following table, we list results corresponding to the iteration steps for Landweber's method with parameter $a = 0.5$ and again $n = 16$ (Table 2.3).


Table 2.3: Landweber iteration

$m$  | $\delta = 0.0001$ | $\delta = 0.001$ | $\delta = 0.01$ | $\delta = 0.1$
1    | 0.8097            | 0.8097           | 0.8088          | 0.8135
2    | 0.6274            | 0.6275           | 0.6278          | 0.6327
3    | 0.5331            | 0.5331           | 0.5333          | 0.5331
4    | 0.4312            | 0.4311           | 0.4322          | 0.4287
5    | 0.3898            | 0.3898           | 0.3912          | 0.3798
6    | 0.3354            | 0.3353           | 0.3360          | 0.3339
7    | 0.3193            | 0.3192           | 0.3202          | 0.3248
8    | 0.2905            | 0.2904           | 0.2912          | 0.2902
9    | 0.2838            | 0.2838           | 0.2845          | 0.2817
10   | 0.2675            | 0.2675           | 0.2677          | 0.2681
100  | 0.0473            | 0.0474           | 0.0476          | 0.0534
200  | 0.0248            | 0.0248           | 0.0253          | 0.0409
300  | 0.0242            | 0.0242           | 0.0249          | 0.0347
400  | 0.0241            | 0.0241           | 0.0246          | 0.0385
500  | 0.0239            | 0.0240           | 0.0243          | 0.0424

We observe that the error decreases quickly in the first few steps and then slows down. To compare Tikhonov's method and Landweber's iteration, we note that the error corresponding to iteration number $m$ has to be compared with the error corresponding to $\alpha = 1/(2m)$ (see the estimates in the proofs of Theorems 2.15 and 2.12). Taking this into account, we observe that both methods are comparable where precision is concerned. We note, however, that the computation time of Landweber's method is considerably higher than for Tikhonov's method, in particular if the error $\delta$ is small. On the other hand, Landweber's method is stable with respect to perturbations of the right-hand side and gives very good results even for large errors $\delta$.

We refer also to Section 3.5, where these regularization methods are compared with those to be discussed in the subsequent sections for Symm's integral equation.


2.5 The Discrepancy Principle of Morozov 


The following three sections are devoted to a posteriori choices of the regularization parameter $\alpha$. In this section, we study a discrepancy principle based on the Tikhonov regularization method. Throughout this section, we assume again that $K : X \to Y$ is a compact and injective operator between Hilbert spaces $X$ and $Y$ with dense range $\mathcal{R}(K) \subset Y$. Again, we study the equation

$$Kx = y^\delta$$

for given perturbations $y^\delta \in Y$ of the exact right-hand side $y^* = Kx^*$. The Tikhonov regularization of this equation was investigated in Section 2.2. It corresponds to the regularization operators

$$R_\alpha = (\alpha I + K^*K)^{-1} K^* \quad\text{for } \alpha > 0$$

that approximate the unbounded inverse of $K$ on $\mathcal{R}(K)$. We have seen that $x^\alpha = R_\alpha y$ exists and is the unique minimum of the Tikhonov functional

$$J_\alpha(x) := \|Kx - y\|_Y^2 + \alpha \|x\|_X^2, \quad x \in X, \ \alpha > 0. \tag{2.30}$$

More facts about the dependence on $\alpha$ and $y$ are proven in the following theorem.
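Theorem 2.16 Let $y \in Y$, $\alpha > 0$, and $x^\alpha$ be the unique solution of the equation

$$\alpha x^\alpha + K^*K x^\alpha = K^*y. \tag{2.31}$$

Then $x^\alpha$ depends continuously on $y$ and $\alpha$. The mapping $\alpha \mapsto \|x^\alpha\|_X$ is monotonically nonincreasing and

$$\lim_{\alpha \to \infty} x^\alpha = 0.$$

The mapping $\alpha \mapsto \|Kx^\alpha - y\|_Y$ is monotonically nondecreasing and

$$\lim_{\alpha \to 0} Kx^\alpha = y.$$

If $y \ne 0$, then strict monotonicity holds in both cases.


Proof: We proceed in five steps.

(i) Using the definition of $J_\alpha$ and the optimality of $x^\alpha$, we conclude that

$$\alpha \|x^\alpha\|_X^2 \le J_\alpha(x^\alpha) \le J_\alpha(0) = \|y\|_Y^2,$$

that is, $\|x^\alpha\|_X \le \|y\|_Y / \sqrt{\alpha}$. This proves that $x^\alpha \to 0$ as $\alpha \to \infty$.

(ii) We choose $\alpha > 0$ and $\beta > 0$ and subtract the equations for $x^\alpha$ and $x^\beta$:

$$\alpha (x^\alpha - x^\beta) + K^*K (x^\alpha - x^\beta) + (\alpha - \beta)\, x^\beta = 0. \tag{2.32}$$

Multiplication by $x^\alpha - x^\beta$ yields

$$\alpha \|x^\alpha - x^\beta\|_X^2 + \|K(x^\alpha - x^\beta)\|_Y^2 = (\beta - \alpha)\, (x^\beta, x^\alpha - x^\beta)_X. \tag{2.33}$$

From this equation, we first conclude that

$$\alpha \|x^\alpha - x^\beta\|_X^2 \le |\beta - \alpha|\, \bigl|(x^\beta, x^\alpha - x^\beta)_X\bigr| \le |\beta - \alpha|\, \|x^\beta\|_X \|x^\alpha - x^\beta\|_X,$$

that is,

$$\alpha \|x^\alpha - x^\beta\|_X \le |\beta - \alpha|\, \|x^\beta\|_X \le |\beta - \alpha|\, \frac{\|y\|_Y}{\sqrt{\beta}}.$$

This proves continuity of the mapping $\alpha \mapsto x^\alpha$.

(iii) Now let $\beta > \alpha > 0$. From (2.33) we conclude that $(x^\beta, x^\alpha - x^\beta)_X$ is (real and) positive (if zero then $x^\alpha = x^\beta$ and thus $x^\alpha = x^\beta = 0$ from (2.32). This would contradict the assumption $y \ne 0$). Therefore, $\|x^\beta\|_X^2 < (x^\beta, x^\alpha)_X \le \|x^\beta\|_X \|x^\alpha\|_X$, that is, $\|x^\beta\|_X < \|x^\alpha\|_X$, which proves strict monotonicity of $\alpha \mapsto \|x^\alpha\|_X$.

(iv) We multiply the normal equation for $x^\beta$ by $x^\alpha - x^\beta$. This yields

$$\beta\, (x^\beta, x^\alpha - x^\beta)_X + (Kx^\beta - y, K(x^\alpha - x^\beta))_Y = 0.$$

Now let $\alpha > \beta$. From (2.33), we see that $(x^\beta, x^\alpha - x^\beta)_X < 0$; that is,

$$0 < (Kx^\beta - y, K(x^\alpha - x^\beta))_Y = (Kx^\beta - y, Kx^\alpha - y)_Y - \|Kx^\beta - y\|_Y^2.$$

The Cauchy-Schwarz inequality yields $\|Kx^\beta - y\|_Y < \|Kx^\alpha - y\|_Y$.

(v) Finally, let $\varepsilon > 0$. Because the range of $K$ is dense in $Y$, there exists $x \in X$ with $\|Kx - y\|_Y^2 \le \varepsilon^2/2$. Choose $\alpha_0$ such that $\alpha_0 \|x\|_X^2 \le \varepsilon^2/2$. Then

$$\|Kx^\alpha - y\|_Y^2 \le J_\alpha(x^\alpha) \le J_\alpha(x) \le \varepsilon^2;$$

that is, $\|Kx^\alpha - y\|_Y \le \varepsilon$ for all $\alpha \le \alpha_0$.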


Now we consider the determination of $\alpha(\delta)$ from the discrepancy principle; see [194-196]. We compute $\alpha = \alpha(\delta) > 0$ such that the corresponding Tikhonov solution $x^{\alpha,\delta}$, that is, the solution of the equation

$$\alpha x^{\alpha,\delta} + K^*K x^{\alpha,\delta} = K^*y^\delta,$$

that is, the minimum of

$$J_{\alpha,\delta}(x) := \|Kx - y^\delta\|_Y^2 + \alpha \|x\|_X^2,$$

satisfies the equation

$$\|Kx^{\alpha,\delta} - y^\delta\|_Y = \delta. \tag{2.34}$$

Note that this choice of $\alpha$ by the discrepancy principle guarantees that, on the one side, the error of the defect is $\delta$ and, on the other side, $\alpha$ is not too small.

Equation (2.34) is uniquely solvable, provided $\delta < \|y^\delta\|_Y$, because by the previous theorem

$$\lim_{\alpha \to \infty} \|Kx^{\alpha,\delta} - y^\delta\|_Y = \|y^\delta\|_Y > \delta$$

and

$$\lim_{\alpha \to 0} \|Kx^{\alpha,\delta} - y^\delta\|_Y = 0 < \delta.$$

Furthermore, $\alpha \mapsto \|Kx^{\alpha,\delta} - y^\delta\|_Y$ is continuous and strictly increasing.


Theorem 2.17 Let $K : X \to Y$ be linear, compact, and one-to-one with dense range in $Y$. Let $Kx^* = y^*$ with $x^* \in X$, $y^* \in Y$, and $y^\delta \in Y$ such that $\|y^\delta - y^*\|_Y \le \delta < \|y^\delta\|_Y$. Let the Tikhonov solution $x^{\alpha(\delta),\delta}$ satisfy $\|Kx^{\alpha(\delta),\delta} - y^\delta\|_Y = \delta$ for all $\delta \in (0, \delta_0)$. Then

(a) $x^{\alpha(\delta),\delta} \to x^*$ for $\delta \to 0$; that is, the discrepancy principle is admissible.

(b) Let $x^* = K^*z \in K^*(Y)$ with $\|z\|_Y \le E$. Then

$$\|x^{\alpha(\delta),\delta} - x^*\|_X \le 2\sqrt{\delta E}.$$

Therefore, the discrepancy principle is an optimal regularization strategy under the information $\|(K^*)^{-1}x^*\|_Y \le E$.


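Proof: $x^\delta := x^{\alpha(\delta),\delta}$ minimizes the Tikhonov functional

$$J^{(\delta)}(x) := J_{\alpha(\delta),\delta}(x) = \alpha(\delta) \|x\|_X^2 + \|Kx - y^\delta\|_Y^2.$$

Therefore, we conclude that

$$\begin{aligned}
\alpha(\delta) \|x^\delta\|_X^2 + \delta^2 = J^{(\delta)}(x^\delta) &\le J^{(\delta)}(x^*) \\
&= \alpha(\delta) \|x^*\|_X^2 + \|y^* - y^\delta\|_Y^2 \\
&\le \alpha(\delta) \|x^*\|_X^2 + \delta^2,
\end{aligned}$$

and hence $\|x^\delta\|_X \le \|x^*\|_X$ for all $\delta > 0$. This yields the following important estimate:

$$\begin{aligned}
\|x^\delta - x^*\|_X^2 &= \|x^\delta\|_X^2 - 2\,\mathrm{Re}\,(x^\delta, x^*)_X + \|x^*\|_X^2 \\
&\le 2\bigl[\|x^*\|_X^2 - \mathrm{Re}\,(x^\delta, x^*)_X\bigr] = 2\,\mathrm{Re}\,(x^* - x^\delta, x^*)_X.
\end{aligned}$$

First, we prove part (b): Let $x^* = K^*z$, $z \in Y$. Then

$$\begin{aligned}
\|x^\delta - x^*\|_X^2 &\le 2\,\mathrm{Re}\,(x^* - x^\delta, K^*z)_X = 2\,\mathrm{Re}\,(y^* - Kx^\delta, z)_Y \\
&= 2\,\mathrm{Re}\,(y^* - y^\delta, z)_Y + 2\,\mathrm{Re}\,(y^\delta - Kx^\delta, z)_Y \\
&\le 2\delta \|z\|_Y + 2\delta \|z\|_Y = 4\delta \|z\|_Y \le 4\delta E.
\end{aligned}$$

(a) Now let $x^* \in X$ and $\varepsilon > 0$ be arbitrary. The range $\mathcal{R}(K^*)$ is dense in $X$ because $K$ is one-to-one. Therefore, there exists $\hat{x} = K^*z \in \mathcal{R}(K^*)$ such that $\|\hat{x} - x^*\|_X \le \varepsilon/3$. Then we conclude by similar arguments as above that

$$\begin{aligned}
\|x^\delta - x^*\|_X^2 &\le 2\,\mathrm{Re}\,(x^* - x^\delta, x^* - \hat{x})_X + 2\,\mathrm{Re}\,(x^* - x^\delta, K^*z)_X \\
&\le 2\|x^* - x^\delta\|_X\, \frac{\varepsilon}{3} + 2\,\mathrm{Re}\,(y^* - Kx^\delta, z)_Y \\
&\le 2\|x^* - x^\delta\|_X\, \frac{\varepsilon}{3} + 4\delta \|z\|_Y.
\end{aligned}$$

This can be rewritten as $\bigl(\|x^* - x^\delta\|_X - \varepsilon/3\bigr)^2 \le \varepsilon^2/9 + 4\delta \|z\|_Y$. Now we choose $\delta > 0$ such that the right-hand side is less than $4\varepsilon^2/9$. Taking the square root, we conclude that $\|x^* - x^\delta\|_X \le \varepsilon$ for this $\delta$.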


The condition $\|y^\delta\|_Y > \delta$ certainly makes sense because otherwise the right-hand side would be less than the error level $\delta$, and $x^\delta = 0$ would be an acceptable approximation to $x^*$.

The determination of $\alpha(\delta)$ is thus equivalent to the problem of finding the zero of the monotonic function $\phi(\alpha) := \|Kx^{\alpha,\delta} - y^\delta\|_Y^2 - \delta^2$ (for fixed $\delta > 0$). It is not necessary to satisfy the equation $\|Kx^{\alpha,\delta} - y^\delta\|_Y = \delta$ exactly. An inclusion of the form

$$c_1 \delta \le \|Kx^{\alpha,\delta} - y^\delta\|_Y \le c_2 \delta$$

for fixed $1 \le c_1 < c_2$ is sufficient to prove the assertions of the previous theorem.

The computation of $\alpha(\delta)$ can be carried out with Newton's method. The derivative of the mapping $\alpha \mapsto x^{\alpha,\delta}$ is given by the solution of the equation $(\alpha I + K^*K)\, \frac{d}{d\alpha} x^{\alpha,\delta} = -x^{\alpha,\delta}$, as is easily seen by differentiating (2.31) with respect to $\alpha$.

In the following theorem, we prove that the order of convergence $O(\sqrt{\delta})$ is best possible for the discrepancy principle. Therefore, by the results of Example 1.20, it cannot be optimal under the information $\|(K^*K)^{-\sigma/2}x\|_X \le E$ for $\sigma > 1$.


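Theorem 2.18 Let $K$ be one-to-one and compact and assume that there exists $\sigma > 0$ such that for every $x \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ with $y = Kx \ne 0$, and all sequences $\delta_n \to 0$ and $y^{\delta_n} \in Y$ with $\|y - y^{\delta_n}\|_Y \le \delta_n$ and $\|y^{\delta_n}\|_Y > \delta_n$ for all $n$, the Tikhonov solutions $x^n = x^{\alpha(\delta_n),\delta_n}$ (where $\alpha(\delta_n)$ is chosen by the discrepancy principle) converge to $x$ faster than $\sqrt{\delta_n}$ to zero, that is,

$$\frac{1}{\sqrt{\delta_n}}\, \|x^n - x\|_X \to 0 \quad\text{as } n \to \infty. \tag{2.35}$$

Then the range $\mathcal{R}(K)$ has to be finite-dimensional.


Proof: We show first that the choice of $\alpha(\delta)$ by the discrepancy principle implies the boundedness of $\alpha(\delta)/\delta$. Abbreviating $x^\delta := x^{\alpha(\delta),\delta}$, we write for $\delta \le \frac{1}{3}\|y\|_Y$

$$\begin{aligned}
\frac{1}{3}\|y\|_Y = \Bigl(1 - \frac{2}{3}\Bigr)\|y\|_Y &\le \|y\|_Y - 2\delta \\
&\le \|y - y^\delta\|_Y + \|y^\delta\|_Y - 2\delta \le \|y^\delta\|_Y - \delta \\
&= \|y^\delta\|_Y - \|y^\delta - Kx^\delta\|_Y \le \|Kx^\delta\|_Y \\
&= \frac{1}{\alpha(\delta)}\, \|KK^*(y^\delta - Kx^\delta)\|_Y \le \frac{\delta}{\alpha(\delta)}\, \|K\|_{\mathcal{L}(X,Y)}^2,
\end{aligned}$$

where we applied $K$ to (2.31). Thus we have shown that there exists $c > 0$ with $\alpha(\delta) \le c\delta$ for all sufficiently small $\delta$.

Now we assume that $\dim \mathcal{R}(K) = \infty$ and construct a contradiction. Let $\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$ be a singular system of $K$ and define

$$x := \frac{1}{\mu_1}\, x_1 \quad\text{and}\quad y^{\delta_n} := y_1 + \delta_n y_n \quad\text{with } \delta_n := \mu_n^2.$$

Then $y = Kx = y_1$ and $\delta_n \to 0$ as $n \to \infty$ and $x \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for every $\sigma > 0$ and $\|y^{\delta_n} - y\|_Y = \delta_n < \sqrt{1 + \delta_n^2} = \|y^{\delta_n}\|_Y$. Therefore, the assumptions for the discrepancy principle are satisfied and thus (2.35) holds.

The solution of $\alpha(\delta_n) x^n + K^*K x^n = K^*y^{\delta_n}$ is given by

$$x^n = \frac{\mu_1}{\alpha(\delta_n) + \mu_1^2}\, x_1 + \frac{\mu_n \delta_n}{\alpha(\delta_n) + \mu_n^2}\, x_n.$$

We compute

$$x^n - x = -\frac{\alpha(\delta_n)}{\mu_1 \bigl(\alpha(\delta_n) + \mu_1^2\bigr)}\, x_1 + \frac{\mu_n \delta_n}{\alpha(\delta_n) + \mu_n^2}\, x_n$$

and hence

$$\frac{1}{\sqrt{\delta_n}}\, \|x^n - x\|_X \ge \frac{\mu_n \sqrt{\delta_n}}{\alpha(\delta_n) + \mu_n^2} = \frac{\delta_n}{\alpha(\delta_n) + \delta_n} = \frac{1}{1 + \alpha(\delta_n)/\delta_n} \ge \frac{1}{1 + c}.$$

This contradicts (2.35).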


We remark that the estimate $\alpha(\delta) \le \delta \|K\|_{\mathcal{L}(X,Y)}^2 / (\|y^\delta\|_Y - \delta)$ derived in the previous proof suggests using $\delta \|K\|_{\mathcal{L}(X,Y)}^2 / (\|y^\delta\|_Y - \delta)$ as a starting value for Newton's method to determine $\alpha(\delta)$.
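A minimal sketch of this Newton iteration, written in the coordinates of a singular system so that all quantities are explicit, might look as follows. It is an illustration under stated assumptions, not the author's implementation: the singular values `mus`, the perturbed data coefficients `etas`, and all function names are hypothetical, and a bisection safeguard on the bracket $[0, \alpha_{\text{start}}]$ is added because plain Newton steps on $\phi$ are not guaranteed to stay in the bracket.

```python
import math


def phi_and_derivative(mus, etas, alpha, delta):
    """phi(alpha) = ||K x - y_delta||^2 - delta^2 and its derivative in alpha."""
    x = [mu * eta / (alpha + mu * mu) for mu, eta in zip(mus, etas)]  # Tikhonov solution
    dx = [-xi / (alpha + mu * mu) for xi, mu in zip(x, mus)]          # (alpha I + K*K) dx = -x
    res = [mu * xi - eta for mu, xi, eta in zip(mus, x, etas)]        # K x - y_delta
    phi = sum(r * r for r in res) - delta ** 2
    dphi = 2.0 * sum(r * mu * d for r, mu, d in zip(res, mus, dx))    # 2 (K x - y, K dx)
    return phi, dphi


def discrepancy_alpha(mus, etas, delta, tol=1e-12, max_iter=100):
    """Safeguarded Newton iteration for phi(alpha) = 0."""
    norm_y = math.sqrt(sum(e * e for e in etas))
    if norm_y <= delta:
        raise ValueError("the discrepancy principle requires ||y_delta|| > delta")
    lo = 0.0
    hi = delta * max(mus) ** 2 / (norm_y - delta)   # starting value from the remark above
    alpha = hi
    for _ in range(max_iter):
        phi, dphi = phi_and_derivative(mus, etas, alpha, delta)
        if abs(phi) <= tol * delta ** 2:
            break
        if phi > 0:
            hi = alpha
        else:
            lo = alpha
        candidate = alpha - phi / dphi if dphi > 0 else lo
        alpha = candidate if lo < candidate < hi else 0.5 * (lo + hi)
    return alpha


# Hypothetical data: x* = (1, ..., 1), y* = K x*, and a perturbation of norm delta.
mus = [1.0, 0.5, 0.25, 0.125, 0.0625]
delta = 0.01
signs = [1, -1, 1, -1, 1]
etas = [mu + delta * s / math.sqrt(len(mus)) for mu, s in zip(mus, signs)]

alpha_start = delta * max(mus) ** 2 / (math.sqrt(sum(e * e for e in etas)) - delta)
alpha = discrepancy_alpha(mus, etas, delta)
x_alpha = [mu * eta / (alpha + mu * mu) for mu, eta in zip(mus, etas)]
defect = math.sqrt(sum((mu * xi - eta) ** 2 for mu, xi, eta in zip(mus, x_alpha, etas)))
error = math.sqrt(sum((xi - 1.0) ** 2 for xi in x_alpha))
E = math.sqrt(sum((1.0 / mu) ** 2 for mu in mus))   # ||z|| for x* = K* z
```

In this example the computed defect matches $\delta$ and the error stays below the bound $2\sqrt{\delta E}$ of Theorem 2.17, part (b).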

There has been an enormous effort to modify the original discrepancy principle while still retaining optimal orders of convergence. We refer to [86, 93, 102, 209, 238].


2.6 Landweber's Iteration Method with Stopping Rule


It is very natural to use the following stopping criterion, which can be implemented in every iterative algorithm for the solution of $Kx = y$.

Let $r > 1$ be a fixed number. Stop the algorithm at the first occurrence of $m \in \mathbb{N}_0$ with $\|Kx^{m,\delta} - y^\delta\|_Y \le r\delta$. The following theorem shows that this choice of $m$ is possible for Landweber's method and leads to an admissible and even optimal regularization strategy.


Theorem 2.19 Let $K : X \to Y$ be linear, compact, and one-to-one with dense range. Let $Kx^* = y^*$ and $y^\delta \in Y$ be perturbations with $\|y^* - y^\delta\|_Y \le \delta$ and $\|y^\delta\|_Y \ge r\delta$ for all $\delta \in (0, \delta_0)$ where $r > 1$ is some fixed parameter (independent of $\delta$). Let the sequence $x^{m,\delta}$, $m = 0, 1, 2, \ldots$, be determined by Landweber's method; that is, $x^{0,\delta} = 0$ and

$$x^{m+1,\delta} = x^{m,\delta} + aK^*(y^\delta - Kx^{m,\delta}), \quad m = 0, 1, 2, \ldots, \tag{2.36}$$

for some $0 < a < 1/\|K\|_{\mathcal{L}(X,Y)}^2$. Then the following assertions hold:

(1) $\lim_{m \to \infty} \|Kx^{m,\delta} - y^\delta\|_Y = 0$ for every $\delta > 0$; that is, the following stopping rule is well-defined: Let $m = m(\delta) \in \mathbb{N}_0$ be the smallest integer with $\|Kx^{m,\delta} - y^\delta\|_Y \le r\delta$.

(2) $\delta^2 m(\delta) \to 0$ for $\delta \to 0$, that is, this choice of $m(\delta)$ is admissible. Therefore, by the assertions of Theorem 2.15, the sequence $x^{m(\delta),\delta}$ converges to $x^*$ as $\delta$ tends to zero.

(3) If $x^* = K^*z \in \mathcal{R}(K^*)$ or $x^* = (K^*K)^{\sigma/2}z \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for some $\sigma > 0$ and some $z$, then we have the following orders of convergence:

$$\|x^{m(\delta),\delta} - x^*\|_X \le c\sqrt{E\delta} \quad\text{or} \tag{2.37a}$$
$$\|x^{m(\delta),\delta} - x^*\|_X \le c\, E^{1/(\sigma+1)} \delta^{\sigma/(\sigma+1)}, \tag{2.37b}$$

respectively, for some $c > 0$ where again $E = \|z\|$. This means that this choice of $m(\delta)$ is optimal for all $\sigma > 0$.


Proof: In (2.25), we showed the representation

$$R_m y = \sum_{j=1}^{\infty} \frac{1 - (1 - a\mu_j^2)^m}{\mu_j}\, (y, y_j)_Y\, x_j$$

for every $y \in Y$ and thus

$$\|KR_m y - y\|_Y^2 = \sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m}\, \bigl|(y, y_j)_Y\bigr|^2.$$

From $|1 - a\mu_j^2| < 1$, we conclude that $\|KR_m - I\|_{\mathcal{L}(Y)} \le 1$. Application to $y^\delta$ instead of $y$ yields

$$\|Kx^{m,\delta} - y^\delta\|_Y^2 = \sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m}\, \bigl|(y^\delta, y_j)_Y\bigr|^2.$$
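(1) Let $\varepsilon > 0$ be given. Choose $j_1 \in \mathbb{N}$ with

$$\sum_{j=j_1+1}^{\infty} \bigl|(y^\delta, y_j)_Y\bigr|^2 < \frac{\varepsilon^2}{2}.$$

Because $|1 - a\mu_j^2|^{2m} \to 0$ as $m \to \infty$ uniformly for $j = 1, \ldots, j_1$, we can find $m_0 \in \mathbb{N}$ with

$$\sum_{j=1}^{j_1} (1 - a\mu_j^2)^{2m}\, \bigl|(y^\delta, y_j)_Y\bigr|^2 \le \max_{j=1,\ldots,j_1} (1 - a\mu_j^2)^{2m} \sum_{j=1}^{j_1} \bigl|(y^\delta, y_j)_Y\bigr|^2 < \frac{\varepsilon^2}{2} \quad\text{for } m \ge m_0.$$

This implies that $\|Kx^{m,\delta} - y^\delta\|_Y^2 < \varepsilon^2$ for $m \ge m_0$; that is, the method is admissible.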
It is sufficient to prove assertion (2) only for the case $m(\delta) \to \infty$. We set $m := m(\delta)$ for abbreviation and derive an upper bound of $m$. By the choice of $m(\delta)$, we have with $y^* = Kx^*$

$$\begin{aligned}
\|KR_{m-1}y^* - y^*\|_Y &\ge \|KR_{m-1}y^\delta - y^\delta\|_Y - \|(KR_{m-1} - I)(y^* - y^\delta)\|_Y \\
&> r\delta - \|KR_{m-1} - I\|_{\mathcal{L}(Y)}\, \delta \ge (r - 1)\,\delta,
\end{aligned} \tag{2.38}$$

and hence

$$m\,(r-1)^2 \delta^2 \le m \sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m-2}\, \bigl|(y^*, y_j)_Y\bigr|^2 = \sum_{j=1}^{\infty} m\,(1 - a\mu_j^2)^{2m-2} \mu_j^2\, \bigl|(x^*, x_j)_X\bigr|^2. \tag{2.39}$$

We show that the series converges to zero as $\delta \to 0$. (The dependence on $\delta$ is hidden in $m$.) First we note that $m\mu^2 (1 - a\mu^2)^{2m-2} \le 1/a$ for all $m \ge 1$ and all $0 \le \mu \le 1/\sqrt{a}$ (see Problem 2.6). Now we again split the series into a finite sum and a remaining series and estimate in the "long tail" the expression $m (1 - a\mu_j^2)^{2m-2} \mu_j^2$ by $1/a$ and note that $m (1 - a\mu_j^2)^{2m-2}$ tends to zero as $m \to \infty$ uniformly in $j \in \{1, \ldots, j_1\}$. This proves convergence and thus part (2).

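For part (3) we remind the reader of the fundamental estimate (2.4a), which we need in the following form:

$$\|x^{m,\delta} - x^*\|_X \le \delta\sqrt{am} + \|R_m Kx^* - x^*\|_X. \tag{2.40}$$

We restrict ourselves to the case that $x^* = (K^*K)^{\sigma/2}z$ for some $\sigma > 0$. We set again $E = \|z\|_X$ and estimate $m = m(\delta)$ from above by using (2.38) and (2.28) for $y^*$ instead of $y^\delta$ (that is, $\delta = 0$) and $m - 1$ instead of $m$:

$$(r - 1)\,\delta \le \|KR_{m-1}y^* - y^*\|_Y \le \Bigl(\frac{\sigma+1}{2a(m-1)}\Bigr)^{(\sigma+1)/2} E.$$

Solving for $m - 1$ yields an estimate of the form

$$m(\delta) \le 2\bigl(m(\delta) - 1\bigr) \le c\, \Bigl(\frac{E}{\delta}\Bigr)^{2/(\sigma+1)}$$

for some $c$ which depends solely on $\sigma$, $r$, and $a$. Substituting this into the first term of (2.40) for $m = m(\delta)$ yields $\delta\sqrt{a\,m(\delta)} \le \sqrt{ac}\; E^{1/(\sigma+1)} \delta^{\sigma/(\sigma+1)}$. Next, we estimate the second term of (2.40). In the following estimates, we use again that $x^* = (K^*K)^{\sigma/2}z$ and also Hölder's inequality with $p = \frac{\sigma+1}{\sigma}$ and $q = \sigma + 1$.

$$\begin{aligned}
\|R_m Kx^* - x^*\|_X^2
&= \sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m}\, \bigl|(x^*, x_j)_X\bigr|^2 = \sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m} \mu_j^{2\sigma}\, \bigl|(z, x_j)_X\bigr|^2 \\
&= \sum_{j=1}^{\infty} \Bigl[(1 - a\mu_j^2)^{2m} \mu_j^{2\sigma+2}\, \bigl|(z, x_j)_X\bigr|^2\Bigr]^{1/p} \Bigl[(1 - a\mu_j^2)^{2m}\, \bigl|(z, x_j)_X\bigr|^2\Bigr]^{1/q} \\
&\le \Bigl[\sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m} \mu_j^{2\sigma+2}\, \bigl|(z, x_j)_X\bigr|^2\Bigr]^{1/p} \Bigl[\sum_{j=1}^{\infty} (1 - a\mu_j^2)^{2m}\, \bigl|(z, x_j)_X\bigr|^2\Bigr]^{1/q} \\
&\le \|KR_m y^* - y^*\|_Y^{2\sigma/(\sigma+1)}\, E^{2/(\sigma+1)} \\
&\le E^{2/(\sigma+1)} \Bigl[\|KR_m y^\delta - y^\delta\|_Y + \|(KR_m - I)(y^* - y^\delta)\|_Y\Bigr]^{2\sigma/(\sigma+1)}.
\end{aligned}$$

Now we use the stopping rule for $m = m(\delta)$ in the first term and the estimate $\|KR_m - I\|_{\mathcal{L}(Y)} \le 1$ in the second one. This yields

$$\|R_{m(\delta)} Kx^* - x^*\|_X^2 \le (r + 1)^{2\sigma/(\sigma+1)}\, E^{2/(\sigma+1)}\, \delta^{2\sigma/(\sigma+1)}.$$

Substituting this into (2.40) for $m = m(\delta)$ yields the assertion.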
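The stopping rule of Theorem 2.19 translates directly into a loop. The following is a minimal sketch for a real matrix `K` in plain Python; the diagonal test matrix, the noise vector, and the function names are hypothetical, and `max_iter` is only a safety limit that the theorem itself does not need.

```python
import math


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def landweber_with_stopping(K, y_delta, a, delta, r=1.5, max_iter=100000):
    """Iterate (2.36) until ||K x - y_delta|| <= r * delta.

    Returns (x, m, defect, previous_defect); previous_defect is the defect of
    iterate m - 1, or None if the rule already stops at m = 0.
    """
    Kt = [list(col) for col in zip(*K)]
    x = [0.0] * len(K[0])
    previous_defect = None
    for m in range(max_iter + 1):
        residual = [yi - ki for yi, ki in zip(y_delta, matvec(K, x))]  # y_delta - K x
        defect = norm(residual)
        if defect <= r * delta:
            return x, m, defect, previous_defect
        previous_defect = defect
        x = [xi + a * gi for xi, gi in zip(x, matvec(Kt, residual))]   # step (2.36)
    raise RuntimeError("stopping rule not reached within max_iter iterations")


# Hypothetical ill-conditioned test problem with exact solution x* = (1, ..., 1).
mus = [1.0, 0.5, 0.25, 0.125, 0.0625]
K = [[mus[i] if i == j else 0.0 for j in range(len(mus))] for i in range(len(mus))]
x_true = [1.0] * len(mus)
delta, r = 0.01, 1.5
signs = [1, -1, 1, -1, 1]
y_delta = [yi + delta * s / math.sqrt(len(mus))            # perturbation of norm delta
           for yi, s in zip(matvec(K, x_true), signs)]

a = 0.9   # satisfies 0 < a < 1 / ||K||^2 = 1
x_stop, m_stop, defect, previous_defect = landweber_with_stopping(K, y_delta, a, delta, r)
error = norm([xi - ti for xi, ti in zip(x_stop, x_true)])
```

The returned pair `defect`, `previous_defect` exhibits the two-sided relation that characterizes the stopping index: the defect at $m(\delta)$ is at most $r\delta$, while the defect one step earlier is still larger.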


It is also possible to formulate a similar stopping criterion for Morozov's discrepancy principle. Choose an arbitrary monotonically decreasing sequence $(\alpha_m)$ in $\mathbb{R}$ with $\lim_{m \to \infty} \alpha_m = 0$. Determine $m = m(\delta)$ as the smallest integer $m$ with $\|Kx^{\alpha_m,\delta} - y^\delta\|_Y \le \delta$. For details, we refer the reader to [89] or [182].

One can construct more general classes of methods through the spectral representation of the solution $x^*$.

Comparing the regularizer $x^m$ of Landweber's method with the true solution $x^*$, we observe that the function $\phi(\mu) = 1/\mu$, $\mu > 0$, is approximated by the polynomial $P_m(\mu) = \bigl[1 - (1 - a\mu^2)^m\bigr]/\mu$. It is certainly possible to choose better polynomial approximations of the function $\mu \mapsto 1/\mu$. Orthogonal polynomials are particularly useful. This leads to the $\nu$-methods; see [21, 118], or [120].

A common feature of these methods that is very crucial in the analysis is the fact that all of the polynomials $P_m$ are independent of $y$ and $y^\delta$. For the important conjugate gradient algorithm discussed in the next section, this is not the case, and that makes an error analysis much more difficult to obtain.


2.7 The Conjugate Gradient Method 


In this section, we study the regularizing properties of the conjugate gradient method. Because the proofs of the theorems are rather technical, we only state the results and defer the proofs to Appendix B.

First, we recall the conjugate gradient method for least squares problems for overdetermined systems of linear equations of the form $Kx = y$. Here, $K \in \mathbb{R}^{m \times n}$ and $y \in \mathbb{R}^m$ with $m \ge n$ are given. Because it is hopeless to satisfy all equations simultaneously, one minimizes the defect $f(x) := \|Kx - y\|^2$, $x \in \mathbb{R}^n$, where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^m$. Standard algorithms for solving least squares problems are the QR-algorithm or the conjugate gradient method; see [71, 106, 132]. Because we assume that the latter is known for systems of equations, we formulate it now (in Figure 2.2) for the operator equation $Kx = y$, where $K : X \to Y$ is a bounded, linear, and injective operator between Hilbert spaces $X$ and $Y$ with adjoint $K^* : Y \to X$.


Define again the function

$$f(x) := \|Kx - y\|_Y^2 = (Kx - y, Kx - y)_Y, \quad x \in X.$$

We abbreviate $\nabla f(x) := 2K^*(Kx - y) \in X$ and note that $\nabla f(x)$ is indeed the Riesz representation (see Theorem A.23) of the Fréchet derivative $f'(x)$ of $f$ at $x$ (see Lemma 2.14). We call two elements $p, q \in X$ $K$-conjugate if $(Kp, Kq)_Y = 0$. If $K$ is one-to-one, this bilinear form has the properties of an inner product on $X$.


  Set $x^0 := 0$ and $m := 0$.
  If $K^*y = 0$: STOP. Otherwise set $p^0 := -K^*y$.
  Repeat:
    $t_m := \dfrac{(Kx^m - y, Kp^m)_Y}{\|Kp^m\|_Y^2}$, $\quad x^{m+1} := x^m - t_m p^m$.
    If $K^*(Kx^{m+1} - y) = 0$: STOP.
    $\gamma_m := \dfrac{\|K^*(Kx^{m+1} - y)\|_X^2}{\|K^*(Kx^m - y)\|_X^2}$, $\quad p^{m+1} := K^*(Kx^{m+1} - y) + \gamma_m p^m$.
    $m := m + 1$.

Figure 2.2: The conjugate gradient method


Theorem 2.20 (Fletcher-Reeves)

Let $K : X \to Y$ be a compact, linear, and injective operator between Hilbert spaces $X$ and $Y$. Then the conjugate gradient method is well-defined and either stops or produces sequences $(x^m)$, $(p^m)$ in $X$ with the properties

$$\bigl(\nabla f(x^m), \nabla f(x^j)\bigr)_X = 0 \quad\text{for all } j \ne m, \tag{2.41a}$$

and

$$(Kp^m, Kp^j)_Y = 0 \quad\text{for all } j \ne m; \tag{2.41b}$$

that is, the gradients are orthogonal and the directions $p^m$ are $K$-conjugate. Furthermore,

$$\bigl(\nabla f(x^j), K^*Kp^m\bigr)_X = 0 \quad\text{for all } j < m. \tag{2.41c}$$

The following theorem gives an interesting and different interpretation of the elements $x^m$.


Theorem 2.21 Let $(x^m)$ and $(p^m)$ be the sequences of the conjugate gradient method. Define the space $V_m := \mathrm{span}\{p^0, \ldots, p^m\}$. Then we have the following equivalent characterizations of $V_m$:

$$V_m = \mathrm{span}\bigl\{\nabla f(x^0), \ldots, \nabla f(x^m)\bigr\} \tag{2.42a}$$
$$\phantom{V_m} = \mathrm{span}\bigl\{p^0, K^*Kp^0, \ldots, (K^*K)^m p^0\bigr\} \tag{2.42b}$$

for $m = 0, 1, \ldots$. The spaces $V_m$ are called Krylov spaces. Furthermore, $x^m$ is the minimum of $f$ on $V_{m-1}$ for every $m \ge 1$.


By this result, we can write $x^m$ in the form

$$x^m = -P_{m-1}(K^*K)\,p^0 = P_{m-1}(K^*K)\,K^*y \tag{2.43}$$

with a well-defined polynomial $P_{m-1} \in \mathcal{P}_{m-1}$ of degree at most $m - 1$ (which depends itself on $y$). Analogously, we write the defect in the form

$$y - Kx^m = y - KP_{m-1}(K^*K)K^*y = y - KK^*P_{m-1}(KK^*)\,y = Q_m(KK^*)\,y$$

with the polynomial $Q_m(t) := 1 - t\,P_{m-1}(t)$ of degree $m$.

Let $\{\mu_j, x_j, y_j : j \in \mathbb{N}\}$ be a singular system for $K$. If it happens that

$$y = \sum_{j=1}^{N} \alpha_j y_j \in W_N := \mathrm{span}\{y_1, \ldots, y_N\}$$

for some $N \in \mathbb{N}$, then all iterates $x^m \in A_N := \mathrm{span}\{x_1, \ldots, x_N\}$ because

$$x^m = P_{m-1}(K^*K)\,K^*y = \sum_{j=1}^{N} \alpha_j\, P_{m-1}(\mu_j^2)\, \mu_j\, x_j.$$

In this exceptional case, the algorithm terminates after at most $N$ iterations because the dimension of $A_N$ is at most $N$ and the gradients $\nabla f(x^i)$ are orthogonal to each other. This is the reason why the conjugate gradient method applied to matrix equations stops after finitely many iterations. For operator equations in infinite-dimensional Hilbert spaces, this method produces sequences of, in general, infinitely many elements.
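The finite termination for matrix equations can be observed with a direct transcription of Figure 2.2. The following plain-Python sketch is only an illustration: the $3 \times 3$ test matrix, the vector `x_true`, and the function names are hypothetical, and the exact test "gradient equals zero" of the figure is replaced by a relative tolerance, which is an assumption needed in floating-point arithmetic.

```python
import math


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(K, y, max_iter, tol=1e-14):
    """Conjugate gradient method of Figure 2.2 for a real matrix K.

    Returns the final iterate and the list of vectors K^T (K x^m - y),
    which equal the gradients of f up to the factor 2.
    """
    Kt = [list(col) for col in zip(*K)]
    x = [0.0] * len(K[0])
    g = matvec(Kt, [-yi for yi in y])          # K^T (K x^0 - y) with x^0 = 0
    p = g[:]                                   # p^0 = -K^T y
    gradients = [g]
    g0_norm = math.sqrt(dot(g, g))
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol * g0_norm:
            break                              # gradient (numerically) zero: STOP
        Kp = matvec(K, p)
        residual = [ri - yi for ri, yi in zip(matvec(K, x), y)]
        t = dot(residual, Kp) / dot(Kp, Kp)    # t_m from Figure 2.2
        x = [xi - t * pi for xi, pi in zip(x, p)]
        residual = [ri - yi for ri, yi in zip(matvec(K, x), y)]
        g_new = matvec(Kt, residual)
        gamma = dot(g_new, g_new) / dot(g, g)  # gamma_m (Fletcher-Reeves)
        p = [gi + gamma * pi for gi, pi in zip(g_new, p)]
        g = g_new
        gradients.append(g)
    return x, gradients


# Hypothetical invertible test matrix; y is generated from a known solution.
K = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 4.0]]
x_true = [1.0, -1.0, 2.0]
y = matvec(K, x_true)

x_cg, gradients = conjugate_gradient(K, y, max_iter=3)
max_error = max(abs(xi - ti) for xi, ti in zip(x_cg, x_true))
```

With three unknowns, three steps suffice up to rounding errors, and the stored gradients are mutually orthogonal as stated in (2.41a).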

The following characterizations of $Q_m$ are useful.


Lemma 2.22 (a) The polynomial $Q_m$ minimizes the functional

$$H(Q) := \|Q(KK^*)\,y\|_Y^2 \quad\text{on } \{Q \in \mathcal{P}_m : Q(0) = 1\}$$

and satisfies

$$H(Q_m) = \|Kx^m - y\|_Y^2.$$

(b) For $k \ne \ell$, the following orthogonality relation holds:

$$\langle Q_k, Q_\ell \rangle := \sum_{j=1}^{\infty} \mu_j^2\, Q_k(\mu_j^2)\, Q_\ell(\mu_j^2)\, \bigl|(y, y_j)_Y\bigr|^2 = 0. \tag{2.44}$$

If $y \notin \mathrm{span}\{y_1, \ldots, y_N\}$ for any $N \in \mathbb{N}$, then $\langle \cdot, \cdot \rangle$ defines an inner product on the space $\mathcal{P}$ of all polynomials.


Without a priori information, the sequence $(x^m)$ does not converge to the solution $x$ of $Kx = y$. The images, however, do converge to $y$. This is the subject of the next theorem.


Theorem 2.23 Let $K$ and $K^*$ be one-to-one, and assume that the conjugate gradient method does not stop after finitely many steps. Then

$$Kx^m \to y \quad\text{as } m \to \infty$$

for every $y \in Y$.


We give a proof of this theorem because it is a simple conclusion of the previous lemma.


Proof: Let $Q \in \mathcal{P}_m$ be an arbitrary polynomial with $Q(0) = 1$. Then, by the previous lemma,

$$\|Kx^m - y\|_Y^2 = H(Q_m) \le H(Q) = \sum_{j=1}^{\infty} Q(\mu_j^2)^2\, \bigl|(y, y_j)_Y\bigr|^2. \tag{2.45}$$

Now let $\varepsilon > 0$ be arbitrary. Choose $j_1 \in \mathbb{N}$ such that

$$\sum_{j=j_1+1}^{\infty} \bigl|(y, y_j)_Y\bigr|^2 < \frac{\varepsilon^2}{2}$$

and choose a function $R \in C[0, \mu_1^2]$ with $R(0) = 1$, $\|R\|_\infty \le 1$ and $R(\mu_j^2) = 0$ for $j = 1, 2, \ldots, j_1$. By the theorem of Weierstrass, there exist polynomials $\tilde{Q}_m \in \mathcal{P}_m$ with $\|R - \tilde{Q}_m\|_\infty \to 0$ as $m \to \infty$. We set $\hat{Q}_m = \tilde{Q}_m / \tilde{Q}_m(0)$, which is possible for sufficiently large $m$ because $R(0) = 1$. Then $\hat{Q}_m$ converges to $R$ as $m \to \infty$ and $\hat{Q}_m(0) = 1$. Substituting this into (2.45) yields

$$\begin{aligned}
\|Kx^m - y\|_Y^2 &\le H(\hat{Q}_m) \\
&\le \sum_{j=1}^{j_1} \bigl|\hat{Q}_m(\mu_j^2) - R(\mu_j^2)\bigr|^2\, \bigl|(y, y_j)_Y\bigr|^2 + \|\hat{Q}_m\|_\infty^2 \sum_{j > j_1} \bigl|(y, y_j)_Y\bigr|^2 \\
&\le \|\hat{Q}_m - R\|_\infty^2\, \|y\|_Y^2 + \frac{\varepsilon^2}{2}\, \|\hat{Q}_m\|_\infty^2.
\end{aligned}$$

This expression is less than $\varepsilon^2$ for sufficiently large $m$.


Now we return to the regularization of the operator equation $Kx^* = y^*$. The operator $P_{m-1}(K^*K)K^* : Y \to X$ corresponds to the regularization operator $R_\alpha$ of the general theory. But this operator certainly depends on the right-hand side $y$. The mapping $y \mapsto P_{m-1}(K^*K)K^*y$ is therefore nonlinear.

So far, we have formulated and studied the conjugate gradient method for unperturbed right-hand sides. Now we consider the situation where we know only an approximation $y^\delta$ of $y^*$ such that $\|y^\delta - y^*\|_Y \le \delta$. We apply the algorithm to $y^\delta$ instead of $y$. This yields a sequence $x^{m,\delta}$ and polynomials $P_m^\delta$ and $Q_m^\delta$. There is no a priori strategy $m = m(\delta)$ such that $x^{m(\delta),\delta}$ converges to $x^*$ as $\delta$ tends to zero; see [78]. An a posteriori choice as in the previous section, however, again leads to an optimal strategy. We stop the algorithm with the smallest $m = m(\delta)$ such that the defect $\|Kx^{m,\delta} - y^\delta\|_Y \le r\delta$, where $r > 1$ is some given parameter. From now on, we make the assumption that $y^\delta$ is never a finite linear combination of the $y_j$. Then, by Theorem 2.23, the defect tends to zero, and this stopping rule is well-defined. We want to show that the choice $m = m(\delta)$ leads to an optimal algorithm. The following analysis, which we learned from [121] (see also [226]), is more elementary than, for example, in [21, 181], or [182]. We carry out the complete analysis but, again, postpone the proofs to Appendix B because they are rather technical.

We recall that by our stopping rule

$$\|Kx^{m(\delta),\delta} - y^\delta\|_Y \le r\delta < \|Kx^{m(\delta)-1,\delta} - y^\delta\|_Y. \tag{2.46}$$

The following theorem establishes the optimality of the conjugate gradient method with this stopping rule.


Theorem 2.24 Assume that $y^* = Kx^*$ and $y^\delta$ do not belong to the linear span of finitely many $y_j$. Let the sequence $x^{m(\delta),\delta}$ be constructed by the conjugate gradient method with stopping rule (2.46) for fixed parameter $r > 1$. Let $x^* = (K^*K)^{\sigma/2}z \in \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for some $\sigma > 0$ and $z \in X$. Then there exists $c > 0$ with

$$\|x^{m(\delta),\delta} - x^*\|_X \le c\, \delta^{\sigma/(\sigma+1)} E^{1/(\sigma+1)}, \tag{2.47}$$

where $E = \|z\|_X$.


As Landweber's method and the regularization method by the spectral cutoff (but in contrast to Tikhonov's method), the conjugate gradient method is optimal for all $\sigma > 0$ under the a priori information $\|(K^*K)^{-\sigma/2}x^*\|_X \le E$.

There is a much simpler implementation of the conjugate gradient method for self-adjoint positive definite operators $K : X \to X$. For such $K$, there exists a unique self-adjoint positive operator $A : X \to X$ with $A^2 = K$. Let $Kx = y$ and set $z := Ax$, that is, $Az = y$. We apply the conjugate gradient method to the equation $Ax = z$ without knowing $z$. In the process of the algorithm, only the elements $A^*z = y$, $\|Ap^m\|^2 = (Kp^m, p^m)$, and $A^*(Ax^m - z) = Kx^m - y$ have to be computed. The square root $A$ and the quantity $z$ do not have to be known explicitly, and the method is much simpler to implement.
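Substituting these three quantities into the scheme of Figure 2.2 (with $A$ in place of $K$ and $z$ in place of $y$) gives an iteration that uses only products with $K$. The following is a minimal sketch for a symmetric positive definite matrix; the test matrix, `x_true`, and the function names are hypothetical, and the recursive update of $Kx^m - y$ is a common convenience rather than something prescribed by the text.

```python
import math


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cg_self_adjoint(K, y, max_iter, tol=1e-14):
    """Conjugate gradient method for K x = y with K symmetric positive definite."""
    x = [0.0] * len(y)
    g = [-yi for yi in y]                    # K x^0 - y, plays the role of A*(A x^0 - z)
    p = g[:]                                 # p^0 = -A* z = -y
    g0_norm = math.sqrt(dot(g, g))
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) <= tol * g0_norm:
            break
        Kp = matvec(K, p)
        t = dot(g, p) / dot(Kp, p)           # (A x - z, A p) / ||A p||^2 = (K x - y, p) / (K p, p)
        x = [xi - t * pi for xi, pi in zip(x, p)]
        g_new = [gi - t * kpi for gi, kpi in zip(g, Kp)]   # K x^{m+1} - y
        gamma = dot(g_new, g_new) / dot(g, g)
        p = [gi + gamma * pi for gi, pi in zip(g_new, p)]
        g = g_new
    return x


# Hypothetical symmetric positive definite test matrix.
K = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
x_true = [1.0, -1.0, 2.0]
y = matvec(K, x_true)

x_cg = cg_self_adjoint(K, y, max_iter=3)
max_error = max(abs(xi - ti) for xi, ti in zip(x_cg, x_true))
```

Neither the square root $A$ nor $z$ appears anywhere in the code, in line with the remark above.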

Actually, the conjugate gradient method presented here is only one member of a large class of conjugate gradient methods. For a detailed study of these methods in connection with ill-posed problems, we refer to [104, 124, 207, 208, 210] and, in particular, the work [122].


2.8 Problems


2.1 Let $K : L^2(0,1) \to L^2(0,1)$ be the integral operator

$$(Kx)(t) := \int_0^1 (1 + ts)\, e^{ts}\, x(s)\, ds, \quad 0 < t < 1.$$

Show by induction that

$$\frac{d^n}{dt^n} (Kx)(t) = \int_0^1 (n + 1 + ts)\, s^n e^{ts}\, x(s)\, ds, \quad 0 < t < 1, \ n = 0, 1, \ldots.$$

Prove that $K$ is one-to-one and that the constant functions do not belong to the range of $K$.

2.2 Apply Tikhonov's method of Section 2.2 to the integral equation

$$\int_0^t x(s)\, ds = y(t), \quad 0 \le t \le 1.$$

Prove that for $y \in H^1(0,1)$ with $y(0) = 0$, Tikhonov's solution $x^\alpha$ is given by the solution of the boundary value problem

$$-\alpha x''(t) + x(t) = y'(t), \quad 0 < t < 1, \quad x(1) = 0, \quad x'(0) = 0.$$

2.3 Let $K : X \to Y$ be compact and one-to-one. For any $\sigma > 0$ let the spaces $X_\sigma$ be defined by $X_\sigma := \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ equipped with the inner product

$$\bigl((K^*K)^{\sigma/2} z_1, (K^*K)^{\sigma/2} z_2\bigr)_{X_\sigma} := (z_1, z_2)_X, \quad z_1, z_2 \in X.$$

Prove that $X_\sigma$ are Hilbert spaces and that $X_{\sigma_2}$ is compactly embedded in $X_{\sigma_1}$ for $\sigma_2 > \sigma_1$.

2.4 The iterated Tikhonov regularization $x^{m,\alpha,\delta}$ of order $m \in \mathbb{N}$ (see [87, 153]) is iteratively defined by

$$x^{0,\alpha,\delta} = 0, \quad (\alpha I + K^*K)\, x^{m+1,\alpha,\delta} = K^*y^\delta + \alpha\, x^{m,\alpha,\delta}$$

for $m = 0, 1, 2, \ldots$. (Note that $x^{1,\alpha,\delta}$ is the ordinary Tikhonov regularization.)

(a) Show that $q^{(m)}(\alpha, \mu) := 1 - \bigl(\frac{\alpha}{\alpha + \mu^2}\bigr)^m$ is the corresponding filter function.

(b) Prove that this filter function leads to a regularizing operator $R_\alpha^{(m)}$ with $\|R_\alpha^{(m)}\|_{\mathcal{L}(Y,X)} \le m/(2\sqrt{\alpha})$, and $q^{(m)}$ satisfies (2.9) from Theorem 2.7 with


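$$\omega_\sigma^{(m)}(\alpha) = c\, \alpha^{\min\{\sigma/2,\, m\}}$$

where $c$ depends only on $m$ and $\sigma$.

(c) Show that the iterated Tikhonov regularization of order $m$ is asymptotically optimal under the information $\|(K^*K)^{-\sigma/2}x^*\|_X \le E$ for every $\sigma \le 2m$.

2.5 Fix $y^\delta$ with $\|y^\delta - y^*\|_Y \le \delta$ and let $x^{\alpha,\delta}$ be the Tikhonov solution corresponding to $\alpha > 0$. The curve

$$\alpha \mapsto \begin{pmatrix} f(\alpha) \\ g(\alpha) \end{pmatrix} := \begin{pmatrix} \|Kx^{\alpha,\delta} - y^\delta\|_Y^2 \\ \|x^{\alpha,\delta}\|_X^2 \end{pmatrix}, \quad \alpha > 0,$$

in $\mathbb{R}^2$ is called an L-curve because it often has the shape of the letter L; see [90, 125, 127].

Show by using a singular system that $f'(\alpha) = -\alpha\, g'(\alpha)$. Furthermore, compute the curvature

$$C(\alpha) := \frac{\bigl|f'(\alpha)\, g''(\alpha) - g'(\alpha)\, f''(\alpha)\bigr|}{\bigl(f'(\alpha)^2 + g'(\alpha)^2\bigr)^{3/2}}$$

and show that the curvature increases monotonically for $0 < \alpha \le 1/\|K\|_{\mathcal{L}(X,Y)}^2$.

2.6 Show that

$$m\mu^2 (1 - a\mu^2)^{2m-2} \le \frac{1}{a} \quad\text{for all } m \ge 1 \text{ and } 0 \le \mu \le \frac{1}{\sqrt{a}}.$$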



Chapter 3 


Regularization by Discretization


In this chapter, we study a different approach to regularizing operator equations
of the form Kx = y, where x and y are elements of certain function spaces.
This approach is motivated by the fact that for the numerical treatment of 
such equations, one has to discretize the continuous problem and reduce it to 
a finite system of (linear or nonlinear) equations. We see in this chapter that 
the discretization schemes themselves are regularization strategies in the sense 
of Chapter 2. 

In Section 3.1, we study the general concept of projection methods and give 
a necessary and sufficient condition for convergence. Although we have in mind 
the treatment of integral equations of the first kind, we treat the general case 
where K is a linear, bounded, not necessarily compact operator between (real 
or complex) Banach or Hilbert spaces. Section 3.2 is devoted to Galerkin meth- 
ods. As special cases, we study least squares and dual least squares methods 
in Subsections 3.2.1 and 3.2.2. In Subsection 3.2.3, we investigate the Bubnov-Galerkin
method for the case where the operator satisfies Gårding's inequality.
In Section 3.3, we illustrate the Galerkin methods for Symm's integral equation
of the first kind. This equation arises in potential theory and serves as a model 
equation for more complicated situations. Section 3.4 is devoted to colloca- 
tion methods. We restrict ourselves to the moment method in Subsection 3.4.1 
and to collocation by piecewise constant functions in Subsection 3.4.2, where 
the analysis is carried out only for Symm’s integral equation. In Section 3.5, 
we present numerical results for various regularization techniques (Tikhonov, 
Landweber, conjugate gradient, projection, and collocation methods) tested for 
Dirichlet boundary value problems for the Laplacian in an ellipse. Finally, we 
study the Backus—Gilbert method in Section 3.6. Although not very popular 
among mathematicians, this method is extensively used by scientists in geo- 
physics and other applied sciences. The general ideas of Sections 3.1 and 3.2 
can also be found in, for example, [17, 168, 182, 226]. 
3.1 Projection Methods


First, we recall the definition of a projection operator. 


Definition 3.1 Let $X$ be a normed space over the field $\mathbb{K}$ where $\mathbb{K} = \mathbb{R}$ or
$\mathbb{K} = \mathbb{C}$. Let $U \subset X$ be a closed subspace. A linear bounded operator $P : X \to X$
is called a projection operator on $U$ if

• $Px \in U$ for all $x \in X$ and
• $Px = x$ for all $x \in U$.

We now summarize some obvious properties of projection operators. 


Theorem 3.2 Every nontrivial projection operator satisfies $P^2 = P$ and
$\|P\|_{\mathcal{L}(X)} \ge 1$.


Proof: $P^2x = P(Px) = Px$ follows from $Px \in U$. Furthermore, $\|P\|_{\mathcal{L}(X)} =
\|P^2\|_{\mathcal{L}(X)} \le \|P\|_{\mathcal{L}(X)}^2$ and $P \ne 0$. This implies $\|P\|_{\mathcal{L}(X)} \ge 1$.


In the following two examples, we introduce the most important projection 
operators. 


Example 3.3 

(a) (Orthogonal projection) Let $X$ be a pre-Hilbert space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$
and $U \subset X$ be a complete subspace. Let $Px \in U$ be the best approximation to
$x$ in $U$; that is, $Px$ satisfies
\[ \|Px - x\|_X \le \|u - x\|_X \quad\text{for all } u \in U. \tag{3.1} \]
By the projection theorem (Theorem A.13 of Appendix A.1), $P : X \to U$ is
linear and $Px \in U$ is characterized by the abstract "normal equation"
$(x - Px, u)_X = 0$ for all $u \in U$, that is, $x - Px \in U^\perp$. In this example, by the
binomial theorem we have
\[ \|x\|_X^2 = \|Px + (x - Px)\|_X^2 = \|Px\|_X^2 + \|x - Px\|_X^2 + 2\,\underbrace{\operatorname{Re}(x - Px, Px)_X}_{=0} \ge \|Px\|_X^2, \]
that is, $\|P\|_{\mathcal{L}(X)} \le 1$ and thus, by Theorem 3.2, $\|P\|_{\mathcal{L}(X)} = 1$. Important
examples of subspaces $U$ are spaces of splines or finite elements.


(b) (Interpolation operator) Let $X = C[a,b]$ be the space of real-valued continuous
functions on $[a,b]$ supplied with the supremum norm. Then $X$ is a normed
space over $\mathbb{R}$. Let $U = \operatorname{span}\{u_1, \ldots, u_n\}$ be an $n$-dimensional subspace and
$t_1, \ldots, t_n \in [a,b]$ such that the interpolation problem in $U$ is uniquely solvable;
that is, $\det[u_j(t_k)] \ne 0$. We define $Px \in U$ by the interpolant of $x \in C[a,b]$ in
$U$, i.e., $u = Px \in U$ satisfies $u(t_j) = x(t_j)$ for all $j = 1, \ldots, n$. Then $P : X \to U$
is a projection operator.
Examples for U are spaces of algebraic or trigonometric polynomials. As a
drawback of these choices, we note that from the results of Faber (see [203]) the 
interpolating polynomials of continuous functions x do not, in general, converge 
to x as the degree of the polynomials tends to infinity. For smooth periodic 
functions, however, trigonometric interpolation at equidistant points converges 
with an optimal order of convergence. We use this fact in Subsection 3.4.2. 
Here, as an example, we recall the interpolation by linear splines. For simplic- 
ity, we formulate only the case where the endpoints are included in the set of 
interpolation points. 

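Let $a = t_1 < \cdots < t_n = b$ be given points, and let $U \subset C[a,b]$ be defined by
\[ U = S_1(t_1, \ldots, t_n) := \bigl\{ x \in C[a,b] : x|_{[t_j, t_{j+1}]} \in \mathcal{P}_1,\ j = 1, \ldots, n-1 \bigr\}, \tag{3.2} \]
where $\mathcal{P}_1$ denotes the space of polynomials of degree at most one. Then the
interpolation operator $Q_n : C[a,b] \to S_1(t_1, \ldots, t_n)$ is given by
\[ Q_n x = \sum_{j=1}^n x(t_j)\, \hat y_j \quad\text{for } x \in C[a,b], \]
where the basis functions $\hat y_j \in S_1(t_1, \ldots, t_n)$ are defined by
\[ \hat y_j(t) = \begin{cases} \dfrac{t - t_{j-1}}{t_j - t_{j-1}}, & t \in [t_{j-1}, t_j] \ (\text{if } j \ge 2), \\[2mm] \dfrac{t_{j+1} - t}{t_{j+1} - t_j}, & t \in [t_j, t_{j+1}] \ (\text{if } j \le n-1), \\[2mm] 0, & t \notin [t_{j-1}, t_{j+1}], \end{cases} \tag{3.3} \]
for $j = 1, \ldots, n$. In this example $\|Q_n\|_{\mathcal{L}(C[a,b])} = 1$ (see Problem 3.1).
For general interpolation operators, $\|Q_n\|_{\mathcal{L}(X)}$ exceeds one and $\|Q_n\|_{\mathcal{L}(X)}$
does not have to be bounded with respect to $n$.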
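
The interpolation by linear splines described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `hat` and `interpolate`, the uniform grid, and the test function are hypothetical choices, and indices are 0-based rather than 1-based as in (3.3).

```python
import math

def hat(j, t, nodes):
    """Hat function of (3.3) for the 0-based node index j on the grid `nodes`."""
    n = len(nodes)
    # Rising branch on [t_{j-1}, t_j]; it does not exist for the first node.
    if j >= 1 and nodes[j - 1] <= t <= nodes[j]:
        return (t - nodes[j - 1]) / (nodes[j] - nodes[j - 1])
    # Falling branch on [t_j, t_{j+1}]; it does not exist for the last node.
    if j <= n - 2 and nodes[j] <= t <= nodes[j + 1]:
        return (nodes[j + 1] - t) / (nodes[j + 1] - nodes[j])
    return 0.0

def interpolate(x, nodes):
    """Return Q_n x as a callable t -> sum_j x(t_j) * hat_j(t)."""
    values = [x(tj) for tj in nodes]
    return lambda t: sum(v * hat(j, t, nodes) for j, v in enumerate(values))

# Hypothetical example data: five equidistant nodes on [0, 1].
nodes = [0.0, 0.25, 0.5, 0.75, 1.0]
x = lambda t: math.sin(math.pi * t)

q = interpolate(x, nodes)    # Q_n x
q2 = interpolate(q, nodes)   # Q_n (Q_n x), should reproduce Q_n x

samples = [k / 200 for k in range(201)]
projection_defect = max(abs(q2(s) - q(s)) for s in samples)
sup_norm_ratio = max(abs(q(s)) for s in samples) / max(abs(x(s)) for s in samples)
```

On the sample grid, `projection_defect` should vanish up to rounding (the projection property $Q_n^2 = Q_n$), and `sup_norm_ratio` should not exceed one, in line with $\|Q_n\|_{\mathcal{L}(C[a,b])} = 1$.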
Now we define the class of projection methods. 


Definition 3.4 Let $X$ and $Y$ be Banach spaces and $K : X \to Y$ be bounded
and one-to-one. Furthermore, let $X_n \subset X$ and $Y_n \subset Y$ be finite-dimensional
subspaces of dimension $n$ and $Q_n : Y \to Y_n$ be a projection operator. For given
$y \in Y$, the projection method for solving the equation $Kx = y$ is to solve the
equation
\[ Q_n K x_n = Q_n y \quad\text{for } x_n \in X_n. \tag{3.4} \]
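Equation (3.4) reduces to a finite linear system by choosing bases $\{\hat x_1, \ldots, \hat x_n\}$
and $\{\hat y_1, \ldots, \hat y_n\}$ of $X_n$ and $Y_n$, respectively. One possibility is to represent $Q_n y$
and every $Q_n K \hat x_j$, $j = 1, \ldots, n$, in the forms
\[ Q_n y = \sum_{i=1}^n \beta_i\, \hat y_i \quad\text{and}\quad Q_n K \hat x_j = \sum_{i=1}^n A_{ij}\, \hat y_i, \quad j = 1, \ldots, n, \tag{3.5} \]
with $\beta_i, A_{ij} \in \mathbb{K}$. The linear combination $x_n = \sum_{j=1}^n \alpha_j \hat x_j$ solves (3.4) if and
only if $\alpha = (\alpha_1, \ldots, \alpha_n)^\top \in \mathbb{K}^n$ solves the finite system of linear equations
\[ \sum_{j=1}^n A_{ij}\, \alpha_j = \beta_i, \quad i = 1, \ldots, n; \quad\text{that is, } A\alpha = \beta. \tag{3.6} \]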


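There is a second possibility to reduce (3.4) to a finite system which is used for
Galerkin methods. Let $Y^*$ be the dual space of $Y$ with dual pairing $\langle\cdot,\cdot\rangle_{Y^*,Y}$.
We choose elements $\hat y_i^* \in Y^*$ for $i = 1, \ldots, n$, such that the matrix $Y \in \mathbb{K}^{n \times n}$
given by $Y_{ij} = \langle \hat y_i^*, \hat y_j \rangle_{Y^*,Y}$ is regular. Then, representing $x_n$ again as
$x_n = \sum_{j=1}^n \alpha_j \hat x_j$, equation (3.4) is equivalent to
\[ \sum_{j=1}^n A_{ij}\, \alpha_j = \beta_i, \quad i = 1, \ldots, n; \quad\text{that is, } A\alpha = \beta, \tag{3.7} \]
where now
\[ A_{ij} = \langle \hat y_i^*, Q_n K \hat x_j \rangle_{Y^*,Y} \quad\text{and}\quad \beta_i = \langle \hat y_i^*, Q_n y \rangle_{Y^*,Y}. \tag{3.8} \]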


The orthogonal projection and the interpolation operator from Example 3.3 
lead to the following important classes of projection methods, which are studied 
in more detail in the next sections. 


Example 3.5 

Let $K : X \to Y$ be bounded and one-to-one.

(a) (Galerkin method) Let $X$ and $Y$ be pre-Hilbert spaces and $X_n \subset X$ and $Y_n \subset Y$
be finite-dimensional subspaces with $\dim X_n = \dim Y_n = n$. Let $Q_n : Y \to Y_n$
be the orthogonal projection. Then the projected equation $Q_n K x_n = Q_n y$ is
equivalent to
\[ (K x_n, z_n)_Y = (y, z_n)_Y \quad\text{for all } z_n \in Y_n. \tag{3.9a} \]
Again let $X_n = \operatorname{span}\{\hat x_1, \ldots, \hat x_n\}$ and $Y_n = \operatorname{span}\{\hat y_1, \ldots, \hat y_n\}$. Looking for a
solution of (3.9a) in the form $x_n = \sum_{j=1}^n \alpha_j \hat x_j$ leads to the system
\[ \sum_{j=1}^n \alpha_j\, (K \hat x_j, \hat y_i)_Y = (y, \hat y_i)_Y \quad\text{for } i = 1, \ldots, n, \tag{3.9b} \]
or $A\alpha = \beta$, where $A_{ij} := (K \hat x_j, \hat y_i)_Y$ and $\beta_i = (y, \hat y_i)_Y$. This corresponds to
(3.7) with $\hat y_i^* = \hat y_i$ after the identification of $Y^*$ with $Y$ (Theorem A.23 of Riesz).


(b) (Collocation method) Let $X$ be a Banach space, $Y = C[a,b]$, and $K : X \to
C[a,b]$ be a bounded operator. Let $a = t_1 < \cdots < t_n = b$ be given points
(collocation points) and $Y_n = S_1(t_1, \ldots, t_n)$ be the corresponding space (3.2) of
linear splines with interpolation operator $Q_n y = \sum_{j=1}^n y(t_j)\, \hat y_j$. Let $y \in C[a,b]$
and some $n$-dimensional subspace $X_n \subset X$ be given. Then $Q_n K x_n = Q_n y$ is
equivalent to
\[ (K x_n)(t_i) = y(t_i) \quad\text{for all } i = 1, \ldots, n. \tag{3.10a} \]
If we denote by $\{\hat x_1, \ldots, \hat x_n\}$ a basis of $X_n$, then looking for a solution of (3.10a)
in the form $x_n = \sum_{j=1}^n \alpha_j \hat x_j$ leads to the finite linear system
\[ \sum_{j=1}^n \alpha_j\, (K \hat x_j)(t_i) = y(t_i), \quad i = 1, \ldots, n, \tag{3.10b} \]
or $A\alpha = \beta$, where $A_{ij} = (K \hat x_j)(t_i)$ and $\beta_i = y(t_i)$.
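We are particularly interested in the study of integral equations of the form
\[ \int_a^b k(t,s)\, x(s)\, ds = y(t), \quad t \in [a,b], \tag{3.11} \]
in $L^2(a,b)$ or $C[a,b]$ for some continuous or weakly singular function $k$. (3.9b)
and (3.10b) now take the form
\[ A\alpha = \beta, \tag{3.12} \]
where $x_n = \sum_{j=1}^n \alpha_j \hat x_j$ and
\[ A_{ij} = \int_a^b\!\!\int_a^b k(t,s)\, \hat x_j(s)\, \hat y_i(t)\, ds\, dt, \tag{3.13a} \]
\[ \beta_i = \int_a^b y(t)\, \hat y_i(t)\, dt \tag{3.13b} \]
for the Galerkin method, and
\[ A_{ij} = \int_a^b k(t_i, s)\, \hat x_j(s)\, ds, \tag{3.13c} \]
\[ \beta_i = y(t_i) \tag{3.13d} \]
for the collocation method.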

Comparing the systems of equations in (3.12), we observe that the computa- 
tion of the matrix elements (3.13c) is less expensive than for those of (3.13a) due 
to the double integration for every matrix element in (3.13a). For this reason, 
collocation methods are generally easier to implement than Galerkin methods. 
On the other hand, Galerkin methods have convergence properties of high order 
in weak norms (superconvergence) which are of practical importance in many 
cases, such as boundary element methods for the solution of boundary value 
problems. 
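
The cost difference between (3.13a) and (3.13c) can be made concrete with a small sketch. Everything specific here is an assumption made for illustration: the kernel $k(t,s) = e^{ts}$ on $[0,1]$, piecewise constant basis functions on a uniform grid for both $X_n$ and $Y_n$, and a composite midpoint rule with `m` nodes per cell. The function names are hypothetical.

```python
import numpy as np

def midpoint_nodes(lo, hi, m):
    """Return m midpoint-rule nodes on [lo, hi] and their common weight."""
    h = (hi - lo) / m
    return lo + h * (np.arange(m) + 0.5), h

def collocation_matrix(kernel, t_pts, edges, m=8):
    """A[i, j] ~ integral of kernel(t_i, s) over cell j, cf. (3.13c)."""
    n = len(edges) - 1
    A = np.zeros((len(t_pts), n))
    for j in range(n):
        s, w = midpoint_nodes(edges[j], edges[j + 1], m)
        for i, ti in enumerate(t_pts):
            A[i, j] = w * np.sum(kernel(ti, s))          # single integral: m kernel values
    return A

def galerkin_matrix(kernel, edges, m=8):
    """A[i, j] ~ double integral of kernel over cell i x cell j, cf. (3.13a)."""
    n = len(edges) - 1
    A = np.zeros((n, n))
    for i in range(n):
        t, wt = midpoint_nodes(edges[i], edges[i + 1], m)
        for j in range(n):
            s, ws = midpoint_nodes(edges[j], edges[j + 1], m)
            # double integral: m * m kernel values per matrix entry
            A[i, j] = wt * ws * np.sum(kernel(t[:, None], s[None, :]))
    return A

n = 8
edges = np.linspace(0.0, 1.0, n + 1)   # cells carrying the piecewise constant basis
t_pts = np.linspace(0.0, 1.0, n)       # collocation points, endpoints included
kernel = lambda t, s: np.exp(t * s)    # hypothetical smooth kernel

A_col = collocation_matrix(kernel, t_pts, edges)
A_gal = galerkin_matrix(kernel, edges)
```

With `m` quadrature nodes per cell, each collocation entry needs `m` kernel evaluations while each Galerkin entry needs `m * m`, which is the double integration referred to above. For weakly singular kernels the midpoint rule used here would have to be replaced by a suitable quadrature.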


For the remaining part of this section, we make the following assumption. 
Assumption 3.6 Let $K : X \to Y$ be a linear, bounded, and injective operator
between Banach spaces, $X_n \subset X$ and $Y_n \subset Y$ be finite-dimensional subspaces
of dimension $n$, and $Q_n : Y \to Y_n$ be a projection operator. We assume that
$\bigcup_{n \in \mathbb{N}} X_n$ is dense in $X$ and that $Q_n K|_{X_n} : X_n \to Y_n$ is one-to-one and thus
invertible. Let $x \in X$ be the solution of
\[ Kx = y. \tag{3.14} \]
By $x_n \in X_n$, we denote the unique solutions of the equations
\[ Q_n K x_n = Q_n y \tag{3.15} \]
for $n \in \mathbb{N}$.


We can represent the solutions $x_n \in X_n$ of (3.15) in the form $x_n = R_n y$,
where $R_n : Y \to X_n \subset X$ is defined by
\[ R_n := \bigl(Q_n K|_{X_n}\bigr)^{-1} Q_n : Y \to X_n \subset X. \tag{3.16} \]
The projection method is called convergent if the approximate solutions $x_n \in X_n$
of (3.15) converge to the exact solution $x \in X$ of (3.14) for every $y \in \mathcal{R}(K)$,
that is, if
\[ R_n K x = \bigl(Q_n K|_{X_n}\bigr)^{-1} Q_n K x \to x, \quad n \to \infty, \tag{3.17} \]
for every $x \in X$.

We observe that this definition of convergence coincides with Definition 2.1 of
a regularization strategy for the equation $Kx = y$ with regularization parameter
$\alpha = 1/n$. Therefore, the projection method converges if and only if $R_n$ is a
regularization strategy for the equation $Kx = y$.

Obviously, we can only expect convergence if we require that $\bigcup_{n \in \mathbb{N}} X_n$ is
dense in $X$ and $Q_n y \to y$ for all $y \in \mathcal{R}(K)$. But, in general, this is not sufficient
for convergence if $K$ is compact. We have to assume the following boundedness
condition.


Theorem 3.7 Let Assumption 3.6 be satisfied. The solution $x_n = R_n y \in X_n$
of (3.15) converges to $x$ for every $y = Kx$ if and only if there exists $c > 0$ such
that
\[ \|R_n K\|_{\mathcal{L}(X)} \le c \quad\text{for all } n \in \mathbb{N}. \tag{3.18} \]
If (3.18) is satisfied the following error estimate holds:
\[ \|x_n - x\|_X \le (1 + c) \min_{z_n \in X_n} \|z_n - x\|_X \tag{3.19} \]
with the same constant $c$ as in (3.18).


Proof: Let the projection method be convergent. Then $R_n K x \to x$ for every
$x \in X$. The assertion follows directly from the principle of uniform boundedness
(Theorem A.28 of Appendix A.3).
Now let $\|R_n K\|_{\mathcal{L}(X)}$ be bounded. The operator $R_n K$ is a projection operator
onto $X_n$ because for $z_n \in X_n$, we have $R_n K z_n = (Q_n K|_{X_n})^{-1} Q_n K z_n = z_n$.
Thus we conclude that
\[ x_n - x = (R_n K - I)x = (R_n K - I)(x - z_n) \quad\text{for all } z_n \in X_n. \]
This yields
\[ \|x_n - x\|_X \le (c + 1)\, \|x - z_n\|_X \quad\text{for all } z_n \in X_n \]
and proves (3.19). Convergence $x_n \to x$ follows because $\bigcup_{n \in \mathbb{N}} X_n$ is dense in $X$.


We show now a perturbation result: It is sufficient to study the question
of convergence for the "principal part" of an operator $K$. In particular, if the
projection method converges for an operator $K$, then convergence and the error
estimates hold also for $K + C$, where $C$ is compact relative to $K$ (that is, $K^{-1}C$
is compact).


Theorem 3.8 Let Assumption 3.6 hold. Let $C : X \to Y$ be a linear operator
with $\mathcal{R}(C) \subset \mathcal{R}(K)$ such that $K + C$ is one-to-one and $K^{-1}C$ is compact in $X$.
Assume, furthermore, that the projection method converges for $K$, that is, that
$R_n K x \to x$ for every $x \in X$, where again
\[ R_n = \bigl[Q_n K|_{X_n}\bigr]^{-1} Q_n. \]
Then it converges also for $K + C$; that is, $\tilde R_n (K + C)x \to x$ for every $x \in X$,
where now
\[ \tilde R_n = \bigl[Q_n (K + C)|_{X_n}\bigr]^{-1} Q_n. \]
Let $x^* \in X$ be the solution of $(K + C)x^* = y^*$ and $\tilde x_n = \tilde R_n y^* \in X_n$ be the
solution of the corresponding projected equation $Q_n(K + C)\tilde x_n = Q_n y^*$. Then
there exists $c > 0$ with
\[ \|\tilde x_n - x^*\|_X \le c \bigl[ \|R_n y^* - K^{-1}y^*\|_X + \|R_n C x^* - K^{-1}C x^*\|_X \bigr] \tag{3.20} \]
for all sufficiently large $n$.


Proof: First we note that $I + K^{-1}C = K^{-1}(K + C)$ is one-to-one and thus,
because of the compactness of $K^{-1}C$, an isomorphism from $X$ onto itself (see
Theorem A.36). We write the equation $Q_n(K + C)\tilde x_n = Q_n y^*$ as $(I + R_n C)\tilde x_n =
R_n y^*$ and have
\[ (I + R_n C)(\tilde x_n - x^*) = R_n y^* - R_n C x^* - x^* = \bigl[R_n y^* - K^{-1}y^*\bigr] - \bigl[R_n C x^* - K^{-1}C x^*\bigr], \]
where we have used that $x^* + K^{-1}Cx^* = K^{-1}y^*$. Now we note that the operators
$R_n C = [R_n K] K^{-1}C$ converge to $K^{-1}C$ in the operator norm because
$R_n K x \to x$ for every $x \in X$ and $K^{-1}C$ is compact in $X$ (see Theorem A.34, part
(d), of Appendix A.3). Therefore, by the general Theorem A.37 of Appendix A.3,
we conclude that $(I + R_n C)^{-1}$ exist for sufficiently large $n$ and their operator
norms are uniformly bounded by some $c > 0$. This yields the estimate (3.20).


The first term on the right-hand side of (3.20) is just the error of the projection
method for the equation $Kx = y^*$, the second for the equation $Kx = Cx^*$.
By the previous Theorem 3.7, these terms are estimated by
\[ \|R_n y^* - K^{-1}y^*\|_X \le (1 + c) \min_{z_n \in X_n} \|K^{-1}y^* - z_n\|_X, \]
\[ \|R_n C x^* - K^{-1}Cx^*\|_X \le (1 + c) \min_{z_n \in X_n} \|K^{-1}Cx^* - z_n\|_X, \]
where $c$ is a bound of $\|R_n K\|_{\mathcal{L}(X)}$.


So far, we have considered the case where the right-hand side $y^* = Kx^*$ is
known exactly. Now we study the case where the right-hand side is known only
approximately. We understand the operator $R_n$ from (3.16) as a regularization
operator in the sense of the previous chapter. We have to distinguish between
two kinds of errors on the right-hand side. The first kind corresponds to the
kind of perturbation discussed in Chapter 2. Instead of the exact right-hand
side $y^*$, only $y^\delta \in Y$ is given with $\|y^\delta - y^*\|_Y \le \delta$. We call this the continuous
perturbation of the right-hand side. A simple application of the triangle
inequality yields the following result.

Theorem 3.9 Let Assumption 3.6 be satisfied and let again $R_n = (Q_n K|_{X_n})^{-1}
Q_n : Y \to X_n \subset X$ as in (3.16). Let $x^* \in X$ be the solution of the unperturbed
equation $Kx^* = y^*$. Furthermore, we assume that the projection method converges;
that is, by Theorem 3.7, $\|R_n K\|_{\mathcal{L}(X)}$ are uniformly bounded with respect
to $n$. Furthermore, let $y^\delta \in Y$ with $\|y^\delta - y^*\|_Y \le \delta$ and $x_n^\delta = R_n y^\delta$ the solution
of the projected equation $Q_n K x_n^\delta = Q_n y^\delta$. Then
\[ \|x_n^\delta - x^*\|_X \le \|x_n^\delta - R_n y^*\|_X + \|R_n y^* - x^*\|_X \le \|R_n\|_{\mathcal{L}(Y,X)}\, \|y^\delta - y^*\|_Y + \|R_n K x^* - x^*\|_X. \tag{3.21} \]

This estimate corresponds to the fundamental estimate from (2.4a). The
first term reflects the ill-posedness of the equation: The (continuous) error $\delta$ of
the right-hand side is multiplied by the norm of $R_n$. The second term describes
the discretization error for exact data.


In practice, one solves the discrete systems (3.6) or (3.7) where the coefficients
$\beta_i$ are replaced by perturbed coefficients $\beta_i^\delta \in \mathbb{K}$; that is, one solves
\[ \sum_{j=1}^n A_{ij}\, \alpha_j^\delta = \beta_i^\delta \quad\text{for } i = 1, \ldots, n; \quad\text{that is, } A\alpha^\delta = \beta^\delta, \tag{3.22} \]
instead of $A\alpha = \beta$ where now
\[ |\beta^\delta - \beta|^2 = \sum_{i=1}^n |\beta_i^\delta - \beta_i|^2 \le \delta^2. \]
Recall that the elements $A_{ij}$ of the matrix $A \in \mathbb{K}^{n \times n}$ and the exact coefficients
$\beta_i$ of $\beta \in \mathbb{K}^n$ are given by (3.5) or (3.8). We call this the discrete perturbation
of the right-hand side. Then $x_n^\delta \in X_n$ is defined by
\[ x_n^\delta = \sum_{j=1}^n \alpha_j^\delta\, \hat x_j. \]
In this case, the choices of basis functions $\hat x_j \in X_n$ and $\hat y_i \in Y_n$ (and the dual
basis functions $\hat y_i^*$) are essential. We will also see that the condition number of
$A$ reflects the ill-conditioning of the equation $Kx = y$. We do not carry out the
analysis for these two forms of discrete perturbations in the general case but do
it only for Galerkin methods in the next section.


3.2. Galerkin Methods 


In this section, we assume that $X$ and $Y$ are (real or complex) Hilbert spaces;
$K : X \to Y$ is linear, bounded, and one-to-one; $X_n \subset X$ and $Y_n \subset Y$ are
finite-dimensional subspaces with $\dim X_n = \dim Y_n = n$; and $Q_n : Y \to Y_n$
is the orthogonal projection operator onto $Y_n$. Then equation $Q_n K x_n = Q_n y$
reduces to the Galerkin equations (see Example 3.5)
\[ (K x_n, z_n)_Y = (y, z_n)_Y \quad\text{for all } z_n \in Y_n. \tag{3.23} \]
If we choose bases $\{\hat x_1, \ldots, \hat x_n\}$ and $\{\hat y_1, \ldots, \hat y_n\}$ of $X_n$ and $Y_n$, respectively, then
this leads to a finite system for the coefficients of $x_n = \sum_{j=1}^n \alpha_j \hat x_j$ (compare
with (3.9b)):
\[ \sum_{j=1}^n A_{ij}\, \alpha_j = \beta_i, \quad i = 1, \ldots, n, \tag{3.24} \]
where
\[ A_{ij} = (K \hat x_j, \hat y_i)_Y \quad\text{and}\quad \beta_i = (y, \hat y_i)_Y. \tag{3.25} \]


The Galerkin method is also known as the Petrov-Galerkin method (see [216]) 
because Petrov was the first to consider the general situation of (3.23). The 
special case X = Y and X,, = Y, was studied by Bubnov in 1913 and later by 
Galerkin in 1915 (see [99]). For this reason, this special case is also known as 
the Bubnov-Galerkin method. In the case when the operator K is self-adjoint 
and positive definite, we will see that the Bubnov—Galerkin method coincides 
with the Rayleigh—Ritz method; see [221] and [228]. 
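Theorem 3.10 Let Assumption 3.6 be satisfied and let again $R_n =
(Q_n K|_{X_n})^{-1} Q_n : Y \to X_n \subset X$ as in (3.16). Let $x^* \in X$ be the solution of
the unperturbed equation $Kx^* = y^*$. Furthermore, we assume that the Galerkin
method converges; that is, by Theorem 3.7, $\|R_n K\|_{\mathcal{L}(X)}$ are uniformly bounded
with respect to $n$.

(a) Let $y^\delta \in Y$ with $\|y^\delta - y^*\|_Y \le \delta$ and $x_n^\delta = R_n y^\delta$ the solution of the
projected equation $Q_n K x_n^\delta = Q_n y^\delta$. Then
\[ \|x_n^\delta - x^*\|_X \le \|R_n\|_{\mathcal{L}(Y,X)}\, \delta + \|R_n K x^* - x^*\|_X. \tag{3.26} \]

(b) Let $Q_n y^* = \sum_{i=1}^n \beta_i \hat y_i$ and $\beta^\delta \in \mathbb{K}^n$ with $|\beta^\delta - \beta|^2 = \sum_{i=1}^n |\beta_i^\delta - \beta_i|^2 \le \delta^2$ and
let $\alpha^\delta \in \mathbb{K}^n$ be the solution of $A\alpha^\delta = \beta^\delta$. Then, with $x_n^\delta = \sum_{j=1}^n \alpha_j^\delta \hat x_j$,
\[ \|x_n^\delta - x^*\|_X \le \frac{a_n}{\lambda_n}\, \delta + \|R_n K x^* - x^*\|_X, \tag{3.27a} \]
\[ \|x_n^\delta - x^*\|_X \le \|R_n\|_{\mathcal{L}(Y,X)}\, b_n\, \delta + \|R_n K x^* - x^*\|_X, \tag{3.27b} \]
where
\[ a_n = \max\Bigl\{ \Bigl\| \sum_{j=1}^n \rho_j \hat x_j \Bigr\|_X : \sum_{j=1}^n |\rho_j|^2 = 1 \Bigr\}, \tag{3.28a} \]
\[ b_n = \max\Bigl\{ \sqrt{\sum_{j=1}^n |\rho_j|^2} : \Bigl\| \sum_{j=1}^n \rho_j \hat y_j \Bigr\|_Y = 1 \Bigr\}, \tag{3.28b} \]
and $\lambda_n > 0$ denotes the smallest singular value of the matrix $A$. We note
that if $X$ or $Y$ are Hilbert spaces and $\{\hat x_j : j = 1, \ldots, n\}$ or $\{\hat y_i : i = 1, \ldots, n\}$,
respectively, are orthonormal systems then $a_n = 1$ or $b_n = 1$,
respectively.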
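Proof: (a) has been shown before.

(b) Again we use $\|x_n^\delta - x^*\|_X \le \|x_n^\delta - R_n y^*\|_X + \|R_n y^* - x^*\|_X$ and estimate
the first term. We note that $R_n y^* = \sum_{j=1}^n \alpha_j \hat x_j$ with $A\alpha = \beta$ and thus
\[ \|x_n^\delta - R_n y^*\|_X = \Bigl\| \sum_{j=1}^n (\alpha_j^\delta - \alpha_j)\, \hat x_j \Bigr\|_X \le a_n\, |\alpha^\delta - \alpha| = a_n\, |A^{-1}(\beta^\delta - \beta)| \le a_n\, |A^{-1}|_2\, |\beta^\delta - \beta| \le \frac{a_n}{\lambda_n}\, \delta, \]
where $|A^{-1}|_2$ denotes the spectral norm of $A^{-1}$, that is, the inverse of the
smallest singular value of $A$. This yields (3.27a).

To prove (3.27b), we choose $y_n^\delta \in Y_n$ such that $(y_n^\delta, \hat y_i)_Y = \beta_i^\delta$ for $i = 1, \ldots, n$.
Then $R_n y_n^\delta = x_n^\delta$ and thus
\begin{align*}
\|x_n^\delta - R_n y^*\|_X &\le \|R_n\|_{\mathcal{L}(Y,X)}\, \|y_n^\delta - Q_n y^*\|_Y \\
&= \|R_n\|_{\mathcal{L}(Y,X)} \sup_{z_n \in Y_n} \frac{|(y_n^\delta - Q_n y^*, z_n)_Y|}{\|z_n\|_Y} \\
&= \|R_n\|_{\mathcal{L}(Y,X)} \sup_{\rho} \frac{\bigl|\sum_{j=1}^n \overline{\rho_j}\, (y_n^\delta - Q_n y^*, \hat y_j)_Y\bigr|}{\bigl\|\sum_{j=1}^n \rho_j \hat y_j\bigr\|_Y} \\
&= \|R_n\|_{\mathcal{L}(Y,X)} \sup_{\rho} \frac{\bigl|\sum_{j=1}^n \overline{\rho_j}\, (\beta_j^\delta - \beta_j)\bigr|}{\bigl\|\sum_{j=1}^n \rho_j \hat y_j\bigr\|_Y} \\
&\le \|R_n\|_{\mathcal{L}(Y,X)}\, |\beta^\delta - \beta| \sup_{\rho} \frac{\sqrt{\sum_{j=1}^n |\rho_j|^2}}{\bigl\|\sum_{j=1}^n \rho_j \hat y_j\bigr\|_Y} \\
&\le \|R_n\|_{\mathcal{L}(Y,X)}\, b_n\, \delta.
\end{align*}
This ends the proof.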
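
The estimate of the first term in (3.27a) can be checked numerically in a finite-dimensional stand-in. The sketch below is an illustration only and rests on assumptions: $X = Y = \mathbb{R}^N$ with the Euclidean inner product, a randomly generated operator and random (non-orthonormal) bases; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 30, 6
K = np.eye(N) + 0.1 * rng.standard_normal((N, N))   # hypothetical operator on R^N
Xhat = rng.standard_normal((N, n))                  # columns: basis of X_n
Yhat = rng.standard_normal((N, n))                  # columns: basis of Y_n

A = Yhat.T @ K @ Xhat                               # A_ij = (K x_j, y_i), cf. (3.25)
a_n = np.linalg.norm(Xhat, 2)                       # (3.28a): largest singular value of the basis matrix
lam_n = np.linalg.svd(A, compute_uv=False).min()    # smallest singular value of A

y_star = rng.standard_normal(N)
beta = Yhat.T @ y_star                              # exact coefficients beta_i = (y*, y_i)
delta = 1e-3
e = rng.standard_normal(n)
beta_delta = beta + delta * e / np.linalg.norm(e)   # |beta_delta - beta| = delta

alpha = np.linalg.solve(A, beta)
alpha_delta = np.linalg.solve(A, beta_delta)

err = np.linalg.norm(Xhat @ (alpha_delta - alpha))  # ||x_n^delta - R_n y*||
bound = a_n / lam_n * delta                         # first term of (3.27a)
```

Here `err` should not exceed `bound`. The ratio `a_n / lam_n` grows when the basis of $X_n$ is badly scaled or when $A$ is nearly singular, which is how the ill-conditioning of $Kx = y$ enters the discrete problem.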


In the following three subsections, we derive error estimates for three special
choices for the finite-dimensional subspaces $X_n$ and $Y_n$. The cases where $X_n$ and
$Y_n$ are coupled by $Y_n = K(X_n)$ or $X_n = K^*(Y_n)$ lead to the least squares method
or the dual least squares method, respectively. Here, $K^* : Y \to X$ denotes the
adjoint of $K$. In Subsection 3.2.3, we study the Bubnov-Galerkin method for
the case where $K$ satisfies Gårding's inequality. In all of the subsections, we
formulate the Galerkin equations for the perturbed cases first without using
particular bases and then with respect to given bases in $X_n$ and $Y_n$.


3.2.1 The Least Squares Method 


An obvious method to solve an equation of the kind $Kx = y$ is the following:
Given a finite-dimensional subspace $X_n \subset X$, determine $x_n \in X_n$ such that
\[ \|K x_n - y\|_Y \le \|K z_n - y\|_Y \quad\text{for all } z_n \in X_n. \tag{3.29} \]
Existence and uniqueness of $x_n \in X_n$ follow easily because $X_n$ is finite-dimensional
and $K$ is one-to-one. The solution $x_n \in X_n$ of this least squares problem is
characterized by
\[ (K x_n, K z_n)_Y = (y, K z_n)_Y \quad\text{for all } z_n \in X_n. \tag{3.30a} \]
We observe that this method is a special case of the Galerkin method when we
set $Y_n := K(X_n)$.
Choosing a basis $\{\hat x_j : j = 1, \ldots, n\}$ of $X_n$ leads to the finite system
\[ \sum_{j=1}^n \alpha_j\, (K \hat x_j, K \hat x_i)_Y = \beta_i = (y, K \hat x_i)_Y \quad\text{for } i = 1, \ldots, n, \tag{3.30b} \]
or $A\alpha = \beta$. This has the form of (3.25) for $\hat y_i = K \hat x_i$. The corresponding matrix
$A \in \mathbb{K}^{n \times n}$ with $A_{ij} = (K \hat x_j, K \hat x_i)_Y$ is symmetric (if $\mathbb{K} = \mathbb{R}$) or Hermitian (if
$\mathbb{K} = \mathbb{C}$) and positive definite because $K$ is also one-to-one.

Again, we study the case where the exact right-hand side $y^*$ is perturbed by
an error. For continuous perturbations, let $x_n^\delta \in X_n$ be the solution of
\[ (K x_n^\delta, K z_n)_Y = (y^\delta, K z_n)_Y \quad\text{for all } z_n \in X_n, \tag{3.31a} \]
where $y^\delta \in Y$ is the perturbed right-hand side with $\|y^\delta - y^*\|_Y \le \delta$.

For the discrete perturbation, we assume that $\beta_i = (y^*, K \hat x_i)_Y$, $i = 1, \ldots, n$,
is replaced by a vector $\beta^\delta \in \mathbb{K}^n$ with $|\beta^\delta - \beta| \le \delta$, where $|\cdot|$ denotes the
Euclidean norm in $\mathbb{K}^n$. This leads to the following finite system of equations
for the coefficients of $x_n^\delta = \sum_{j=1}^n \alpha_j^\delta \hat x_j$:
\[ \sum_{j=1}^n \alpha_j^\delta\, (K \hat x_j, K \hat x_i)_Y = \beta_i^\delta \quad\text{for } i = 1, \ldots, n. \tag{3.31b} \]
This system is uniquely solvable because the matrix $A$ is positive definite. For
least squares methods, the boundedness condition (3.18) is not satisfied without
additional assumptions. We refer to [246] or [168], Problem 17.2, for an example.
However, we can prove the following theorem.


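Theorem 3.11 Let $K : X \to Y$ be a linear, bounded, and injective operator
between Hilbert spaces and $X_n \subset X$ be finite-dimensional subspaces such that
$\bigcup_{n \in \mathbb{N}} X_n$ is dense in $X$. Let $x^* \in X$ be the solution of $Kx^* = y^*$ and $x_n^\delta \in X_n$
be the least squares solution from (3.31a) or (3.31b). Define
\[ \sigma_n := \max\bigl\{ \|z_n\|_X : z_n \in X_n,\ \|K z_n\|_Y = 1 \bigr\} \tag{3.32} \]
and let there exist $c > 0$, independent of $n$, such that
\[ \min_{z_n \in X_n} \bigl\{ \|x - z_n\|_X + \sigma_n \|K(x - z_n)\|_Y \bigr\} \le c\, \|x\|_X \quad\text{for all } x \in X. \tag{3.33} \]
Then the least squares method is convergent and $\|R_n\|_{\mathcal{L}(Y,X)} \le \sigma_n$. In this case,
we have the error estimate
\[ \|x^* - x_n^\delta\|_X \le b_n \sigma_n \delta + \tilde c\, \min\bigl\{ \|x^* - z_n\|_X : z_n \in X_n \bigr\} \tag{3.34} \]
for some $\tilde c > 0$. Here, $b_n = 1$ if $x_n^\delta \in X_n$ solves (3.31a); that is, $\delta$ measures the
continuous perturbation $\|y^\delta - y^*\|_Y$. If $\delta$ measures the discrete error $|\beta^\delta - \beta|$ in
the Euclidean norm and $x_n^\delta = \sum_{j=1}^n \alpha_j^\delta \hat x_j \in X_n$, where $\alpha^\delta$ solves (3.31b), then
$b_n$ is given by
\[ b_n = \max\Bigl\{ \sqrt{\sum_{j=1}^n |\rho_j|^2} : \Bigl\| K\Bigl( \sum_{j=1}^n \rho_j \hat x_j \Bigr) \Bigr\|_Y = 1 \Bigr\}. \tag{3.35} \]
If $\{\hat x_j : j = 1, \ldots, n\}$ is an orthonormal basis of $X_n$ then $b_n = \sigma_n$.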
Proof: It is the aim to apply Theorem 3.10. First we prove that $\|R_n K\|_{\mathcal{L}(X)}$
are bounded uniformly in $n$. Let $x \in X$ and $x_n := R_n K x$. Then $x_n$ satisfies
$(K x_n, K z_n)_Y = (K x, K z_n)_Y$ for all $z_n \in X_n$. This yields
\begin{align*}
\|K(x_n - z_n)\|_Y^2 &= \bigl(K(x_n - z_n), K(x_n - z_n)\bigr)_Y \\
&= \bigl(K(x - z_n), K(x_n - z_n)\bigr)_Y \\
&\le \|K(x - z_n)\|_Y\, \|K(x_n - z_n)\|_Y
\end{align*}
and thus $\|K(x_n - z_n)\|_Y \le \|K(x - z_n)\|_Y$ for all $z_n \in X_n$. Using this and the
definition of $\sigma_n$, we conclude that
\[ \|x_n - z_n\|_X \le \sigma_n \|K(x_n - z_n)\|_Y \le \sigma_n \|K(x - z_n)\|_Y \]
and thus
\begin{align*}
\|x_n\|_X &\le \|x_n - z_n\|_X + \|z_n - x\|_X + \|x\|_X \\
&\le \|x\|_X + \bigl[ \|z_n - x\|_X + \sigma_n \|K(x - z_n)\|_Y \bigr].
\end{align*}
This holds for all $z_n \in X_n$. Taking the minimum, we have by Assumption (3.33)
that $\|x_n\|_X \le (1 + c)\|x\|_X$. Thus the boundedness condition (3.18) is satisfied.
The application of Theorem 3.7 proves convergence.

Analogously we prove the estimate for $\|R_n\|_{\mathcal{L}(Y,X)}$. Let $y \in Y$ and set
$x_n := R_n y$. Then from (3.30a), we have
\[ \|K x_n\|_Y^2 = (K x_n, K x_n)_Y = (y, K x_n)_Y \le \|y\|_Y\, \|K x_n\|_Y \]
and thus
\[ \|x_n\|_X \le \sigma_n \|K x_n\|_Y \le \sigma_n \|y\|_Y. \]
This proves the estimate $\|R_n\|_{\mathcal{L}(Y,X)} \le \sigma_n$.

The error estimates (3.34) follow directly from Theorem 3.10 and the estimates
(3.26) and (3.27b) for $\hat y_i = K \hat x_i$.

For further numerical aspects of least squares methods, we refer to [79, 82, 
107, 150, 188, 189, 201, 202]. 
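
As a rough numerical illustration of the least squares method, the following sketch assembles the normal equations (3.30b) for a discretized operator and checks the stability bound $\|R_n\|_{\mathcal{L}(Y,X)} \le \sigma_n$. The setting is an assumption made for this example: the operator is a simple quadrature discretization of $(Kx)(t) = \int_0^t x(s)\, ds$, Euclidean norms on the grid stand in for the $L^2$ norms, and $X_n$ is spanned by orthonormalized polynomials. The function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 40, 5
h = 1.0 / N
K = h * np.tril(np.ones((N, N)))        # hypothetical discretization of integration on [0, 1]
s = (np.arange(N) + 0.5) * h
# Columns of B: an orthonormal basis of X_n (polynomials of degree < n on the grid).
B, _ = np.linalg.qr(np.vander(s, n, increasing=True))

def least_squares_solution(y):
    """Solve (3.30b): sum_j alpha_j (K x_j, K x_i) = (y, K x_i)."""
    KB = K @ B
    A = KB.T @ KB                       # symmetric positive definite
    beta = KB.T @ y
    return B @ np.linalg.solve(A, beta)

x_true = B @ np.array([1.0, -0.5, 0.25, 0.0, 0.1])   # exact solution chosen inside X_n
y = K @ x_true
x_n = least_squares_solution(y)

# (3.32) for an orthonormal basis: sigma_n = 1 / (smallest singular value of K restricted to X_n).
sigma_n = 1.0 / np.linalg.svd(K @ B, compute_uv=False).min()

delta = 1e-3
noise = rng.standard_normal(N)
y_delta = y + delta * noise / np.linalg.norm(noise)  # continuous perturbation of size delta
x_delta = least_squares_solution(y_delta)
data_error_effect = np.linalg.norm(x_delta - x_n)
```

In this sketch `data_error_effect` should stay below `sigma_n * delta`. Forming $A$ explicitly squares the condition number of the restricted operator, so in practice one would often prefer a QR or SVD based least squares solver; the normal equations are used here only because they mirror (3.30b).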


3.2.2. The Dual Least Squares Method 


As the next example for the Galerkin method, we study the dual least squares
method. We will see that the boundedness condition (3.18) is always satisfied.
We assume in addition to the general assumptions of this section that the range
$\mathcal{R}(K)$ of $K$ is dense in $Y$.

Given any finite-dimensional subspace $Y_n \subset Y$, determine $u_n \in Y_n$ such that
\[ (K K^* u_n, z_n)_Y = (y, z_n)_Y \quad\text{for all } z_n \in Y_n, \tag{3.36} \]
where $K^* : Y \to X$ denotes the adjoint of $K$. Then $x_n := K^* u_n$ is called the
dual least squares solution. It is a special case of the Galerkin method when we
set $X_n := K^*(Y_n)$. Writing equation (3.36) for $y = Kx$ in the form
\[ (K^* u_n, K^* z_n)_X = (x, K^* z_n)_X \quad\text{for all } z_n \in Y_n, \]
we observe that the dual least squares method is just the least squares method
for the equation $K^* u = x$. This explains the name.

We assume again that the exact right-hand side is perturbed. Let $y^\delta \in Y$
with $\|y^\delta - y^*\|_Y \le \delta$. Instead of equation (3.36), one determines $x_n^\delta = K^* u_n^\delta \in
X_n$ with
\[ (K^* u_n^\delta, K^* z_n)_X = (y^\delta, z_n)_Y \quad\text{for all } z_n \in Y_n. \tag{3.37} \]
For discrete perturbations, we choose a basis $\{\hat y_j : j = 1, \ldots, n\}$ of $Y_n$ and
assume that the right-hand sides $\beta_i = (y^*, \hat y_i)_Y$, $i = 1, \ldots, n$, of the Galerkin
equations are perturbed by a vector $\beta^\delta \in \mathbb{K}^n$ with $|\beta^\delta - \beta| \le \delta$ where $|\cdot|$ denotes
the Euclidean norm in $\mathbb{K}^n$. Instead of (3.36), we determine
\[ x_n^\delta = K^* u_n^\delta = \sum_{j=1}^n \alpha_j^\delta\, K^* \hat y_j, \]
where $\alpha^\delta \in \mathbb{K}^n$ solves
\[ \sum_{j=1}^n \alpha_j^\delta\, (K^* \hat y_j, K^* \hat y_i)_X = \beta_i^\delta \quad\text{for } i = 1, \ldots, n. \tag{3.38} \]

First we show that equations (3.37) and (3.38) are uniquely solvable. $K^* : Y \to
X$ is one-to-one because the range $\mathcal{R}(K)$ is dense in $Y$. Thus the dimensions
of $Y_n$ and $X_n$ coincide and $K^*$ is an isomorphism from $Y_n$ onto $X_n$. It is
sufficient to prove the uniqueness of a solution to (3.37). Let $u_n \in Y_n$ with
$(K^* u_n, K^* z_n)_X = 0$ for all $z_n \in Y_n$. For $z_n = u_n$ we conclude that $0 =
(K^* u_n, K^* u_n)_X = \|K^* u_n\|_X^2$, that is, $K^* u_n = 0$ or $u_n = 0$.

Convergence and error estimates are proven in the following theorem. 


Theorem 3.12 Let $X$ and $Y$ be Hilbert spaces and $K : X \to Y$ be linear,
bounded, and one-to-one such that the range $\mathcal{R}(K)$ is dense in $Y$. Let $Y_n \subset Y$
be finite-dimensional subspaces such that $\bigcup_{n \in \mathbb{N}} Y_n$ is dense in $Y$. Let $x^* \in X$
be the solution of $Kx^* = y^*$. Then the Galerkin equations (3.37) and (3.38)
are uniquely solvable for every right-hand side and every $n \in \mathbb{N}$. The dual least
squares method is convergent and
\[ \|R_n\|_{\mathcal{L}(Y,X)} \le \sigma_n := \max\bigl\{ \|z_n\|_Y : z_n \in Y_n,\ \|K^* z_n\|_X = 1 \bigr\}. \tag{3.39} \]
Furthermore, we have the error estimates
\[ \|x^* - x_n^\delta\|_X \le b_n \sigma_n \delta + c\, \min\bigl\{ \|x^* - z_n\|_X : z_n \in K^*(Y_n) \bigr\} \tag{3.40} \]
for some $c > 0$. Here, $b_n = 1$ if $x_n^\delta \in X_n$ solves (3.37); that is, $\delta$ measures
the norm $\|y^\delta - y^*\|_Y$ in $Y$. If $\delta$ measures the discrete error $|\beta^\delta - \beta|$ and $x_n^\delta =
\sum_{j=1}^n \alpha_j^\delta K^* \hat y_j \in X_n$, where $\alpha^\delta$ solves (3.38), then $b_n$ is given by
\[ b_n = \max\Bigl\{ \sqrt{\sum_{j=1}^n |\rho_j|^2} : \Bigl\| \sum_{j=1}^n \rho_j \hat y_j \Bigr\|_Y = 1 \Bigr\}. \tag{3.41} \]
We note that $b_n = 1$ if $\{\hat y_j : j = 1, \ldots, n\}$ forms an orthonormal system in $Y$.
Proof: We have seen already that (3.37) and (3.38) are uniquely solvable for
every right-hand side and every $n \in \mathbb{N}$.

Now we prove the estimate $\|R_n K\|_{\mathcal{L}(X)} \le 1$, that is, condition (3.18) with
$c = 1$. Let $x \in X$ and set $x_n := R_n K x \in X_n$. Then $x_n = K^* u_n$, and $u_n \in Y_n$
satisfies
\[ (K^* u_n, K^* z_n)_X = (K x, z_n)_Y \quad\text{for all } z_n \in Y_n. \]
For $z_n = u_n$ this implies
\[ \|x_n\|_X^2 = \|K^* u_n\|_X^2 = (K x, u_n)_Y = (x, K^* u_n)_X \le \|x\|_X\, \|x_n\|_X, \]
which proves the desired estimate. If we replace $Kx$ by $y$ in the preceding
arguments, we have
\[ \|x_n\|_X^2 \le \|y\|_Y\, \|u_n\|_Y \le \sigma_n \|y\|_Y\, \|K^* u_n\|_X = \sigma_n \|y\|_Y\, \|x_n\|_X, \]
which proves (3.39).

Finally, we show that $\bigcup_{n \in \mathbb{N}} X_n$ is dense in $X$. Let $x \in X$ and $\varepsilon > 0$. Because
$K^*(Y)$ is dense in $X$, there exists $y \in Y$ with $\|x - K^* y\|_X < \varepsilon/2$. Because
$\bigcup_{n \in \mathbb{N}} Y_n$ is dense in $Y$, there exists $y_n \in Y_n$ with $\|y - y_n\|_Y < \varepsilon/(2\|K\|_{\mathcal{L}(X,Y)})$.
The triangle inequality yields that for $x_n := K^* y_n \in X_n$,
\[ \|x - x_n\|_X \le \|x - K^* y\|_X + \|K^*(y - y_n)\|_X \le \varepsilon. \]
The application of Theorem 3.10 and the estimates (3.26) and (3.27b) proves
(3.40).
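
The key step of the proof, $\|R_n K\|_{\mathcal{L}(X)} \le 1$, can be observed in a finite-dimensional sketch: there $R_n K$ turns out to be the orthogonal projection onto $X_n = K^*(Y_n)$. The example assumes a Euclidean setting on a grid, a simple discretized integration operator, and orthonormalized polynomials as a basis of $Y_n$; these choices and all names are hypothetical.

```python
import numpy as np

N, n = 40, 5
h = 1.0 / N
K = h * np.tril(np.ones((N, N)))        # hypothetical discretized operator (invertible)
t = (np.arange(N) + 0.5) * h
# Columns of C: an orthonormal basis of Y_n.
C, _ = np.linalg.qr(np.vander(t, n, increasing=True))

KtC = K.T @ C                           # columns K* y_j span X_n = K*(Y_n)
A = KtC.T @ KtC                         # A_ij = (K* y_j, K* y_i), cf. (3.38)

def dual_least_squares(y):
    """Solve (3.38) with beta_i = (y, y_i) and return x_n = sum_j alpha_j K* y_j."""
    beta = C.T @ y
    alpha = np.linalg.solve(A, beta)
    return KtC @ alpha

x = np.sin(2 * np.pi * t)               # some element of X, not contained in X_n
x_n = dual_least_squares(K @ x)         # R_n K x

P = KtC @ np.linalg.solve(A, KtC.T)     # matrix representation of R_n K
```

One expects `P @ P` to agree with `P`, its spectral norm to equal one up to rounding, and `np.linalg.norm(x_n) <= np.linalg.norm(x)`, matching condition (3.18) with $c = 1$.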


3.2.3 The Bubnov-Galerkin Method for Coercive Operators

In this subsection, we assume that $Y = X$, that $K : X \to X$ is a linear
and bounded operator, and that $X_n$, $n \in \mathbb{N}$, are finite-dimensional subspaces. The
Galerkin method reduces to the problem of determining $x_n \in X_n$ such that
\[ (K x_n, z_n)_X = (y, z_n)_X \quad\text{for all } z_n \in X_n. \tag{3.42} \]
This special case is called the Bubnov-Galerkin method. Again, we consider two
kinds of perturbations of the right-hand side. If $y^\delta \in X$ with $\|y^\delta - y^*\|_X \le \delta$ is
a perturbed right-hand side, then instead of (3.42) we study the equation
\[ (K x_n^\delta, z_n)_X = (y^\delta, z_n)_X \quad\text{for all } z_n \in X_n. \tag{3.43} \]
The other possibility is to choose a basis $\{\hat x_j : j = 1, \ldots, n\}$ of $X_n$ and assume
that the right-hand sides $\beta_i = (y^*, \hat x_i)_X$, $i = 1, \ldots, n$, of the Galerkin equations
are perturbed by a vector $\beta^\delta \in \mathbb{K}^n$ with $|\beta^\delta - \beta| \le \delta$, where $|\cdot|$ denotes again
the Euclidean norm in $\mathbb{K}^n$. In this case, instead of (3.42), we have to solve
\[ \sum_{j=1}^n \alpha_j^\delta\, (K \hat x_j, \hat x_i)_X = \beta_i^\delta \quad\text{for } i = 1, \ldots, n, \tag{3.44} \]
for $\alpha^\delta \in \mathbb{K}^n$ and set $x_n^\delta = \sum_{j=1}^n \alpha_j^\delta \hat x_j$.

Before we prove a convergence result for this method, we briefly describe the 
Rayleigh—Ritz method and show that it is a special case of the Bubnov—Galerkin 
method. 

Let $K : X \to X$ also be self-adjoint and positive definite, that is, $(Kx, y)_X =
(x, Ky)_X$ and $(Kx, x)_X > 0$ for all $x, y \in X$ with $x \ne 0$. We define the functional
\[ \psi(z) := (Kz, z)_X - 2\operatorname{Re}(y, z)_X \quad\text{for } z \in X. \tag{3.45} \]
From the equation
\[ \psi(z) - \psi(x) = 2\operatorname{Re}(Kx - y, z - x)_X + \bigl(K(z - x), z - x\bigr)_X \tag{3.46} \]
and the positivity of $K$, we easily conclude (see Problem 3.2) that $x \in X$ is
the unique minimum of $\psi$ if and only if $x$ solves $Kx = y$. The Rayleigh-Ritz
method is to minimize $\psi$ over the finite-dimensional subspace $X_n$. From (3.46),
we see that if $x_n \in X_n$ minimizes $\psi$ on $X_n$, then, for $z_n = x_n \pm \varepsilon u_n$ with
$u_n \in X_n$ and $\varepsilon > 0$, we have that
\[ 0 \le \psi(z_n) - \psi(x_n) = \pm 2\varepsilon \operatorname{Re}(K x_n - y, u_n)_X + \varepsilon^2 (K u_n, u_n)_X \]
for all $u_n \in X_n$. Dividing by $\varepsilon > 0$ and letting $\varepsilon \to 0$ yields that $x_n \in X_n$
satisfies the Galerkin equation (3.42). If, on the other hand, $x_n \in X_n$ solves
(3.42), then from (3.46)
\[ \psi(z_n) - \psi(x_n) = \bigl(K(z_n - x_n), z_n - x_n\bigr)_X \ge 0 \]
for all $z_n \in X_n$. Therefore, the Rayleigh-Ritz method is identical to the
Bubnov-Galerkin method.
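
The equivalence of the Rayleigh-Ritz and the Bubnov-Galerkin method can be tried out in $\mathbb{R}^N$. In the following sketch, which is an illustration under assumptions and not part of the original text, the operator is a randomly generated symmetric positive definite matrix, $X_n$ is the span of a few random vectors, and the identity derived from (3.46) is checked on random elements of $X_n$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 12, 4
M = rng.standard_normal((N, N))
K = M @ M.T + N * np.eye(N)           # hypothetical symmetric positive definite operator
y = rng.standard_normal(N)
B = rng.standard_normal((N, n))       # columns: a (non-orthonormal) basis of X_n

def psi(z):
    """Functional (3.45) in the real case: psi(z) = (Kz, z) - 2 (y, z)."""
    return z @ K @ z - 2.0 * (y @ z)

# Bubnov-Galerkin equations (3.42) in the basis B: (B^T K B) alpha = B^T y.
alpha = np.linalg.solve(B.T @ K @ B, B.T @ y)
x_n = B @ alpha

# For every z_n in X_n, (3.46) predicts psi(z_n) - psi(x_n) = (K(z_n - x_n), z_n - x_n) >= 0.
gaps = []
for _ in range(100):
    z_n = B @ rng.standard_normal(n)
    d = z_n - x_n
    gaps.append((psi(z_n) - psi(x_n), d @ K @ d))
```

Each pair in `gaps` should agree up to rounding and be nonnegative, which reflects that the Galerkin solution `x_n` minimizes $\psi$ over $X_n$.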


Now we generalize the Rayleigh-Ritz method and study the Bubnov-Galerkin
method for the important class of coercive operators in Gelfand triples $V \subset X \subset
V'$. For the definition of a Gelfand triple and coercive operators $K : V' \to V$, we
refer to Definition A.26 of Appendix A.3. We just recall that $V' = \{\ell : V \to \mathbb{K} :
\overline{\ell} \in V^*\}$ is the space of anti-linear functionals on $V$ with corresponding bounded
sesquilinear form $\langle\cdot,\cdot\rangle : V' \times V \to \mathbb{K}$ which extends the inner product in $X$, that
is, $\langle x, v \rangle = (x, v)_X$ for all $v \in V$ and $x \in X$. Coercivity of an operator $K$ from
$V'$ into $V$ means that there exists a constant $\gamma > 0$ with
\[ |\langle x, Kx \rangle| \ge \gamma\, \|x\|_{V'}^2 \quad\text{for all } x \in V'. \]
This condition implies that $K$ is an isomorphism from $V'$ onto $V$; see the remark
following Definition A.26. If $V$ is compactly imbedded in $X$, that is, $j : V \hookrightarrow X$
is compact, then $K|_X$ is a compact operator from $X$ into itself. Therefore, in
this subsection we measure the compactness by the "smoothing" of $K$ from $V'$
onto $V$ rather than by a decay of the singular values.

Now we can prove the main theorem about convergence of the Bubnov-Galerkin
method for coercive operators in Gelfand triples.
Theorem 3.13 Let $V \subset X \subset V'$ be a Gelfand triple, and $X_n \subset V$ be finite-dimensional subspaces such that $\bigcup_{n\in\mathbb{N}} X_n$ is dense in $X$. Let $K : V' \to V$ be coercive in the sense of Definition A.26 with constant $\gamma > 0$. Let $x^* \in X$ be the solution of $Kx^* = y^*$ for some $y^* \in V$. Then we have the following:

(a) There exist unique solutions of the Galerkin equations (3.42)-(3.44). The Bubnov-Galerkin solutions $x_n \in X_n$ of (3.42) converge in $V'$ with

$$\|x^* - x_n\|_{V'} \le c\,\min\{\|x^* - z_n\|_{V'} : z_n \in X_n\} \tag{3.47}$$

for some $c > 0$.

(b) Define $\rho_n > 0$ by

$$\rho_n = \max\{\|z_n\|_X : z_n \in X_n,\ \|z_n\|_{V'} = 1\} \tag{3.48}$$

and the orthogonal projection operator $P_n$ from $X$ onto $X_n$. The Bubnov-Galerkin solutions converge in $X$ (rather than in the weaker norm of $V'$) if there exists $c > 0$ with

$$\|x - P_n x\|_{V'} \le \frac{c}{\rho_n}\,\|x\|_X \quad\text{for all } x \in X. \tag{3.49}$$

In this case, we have the estimates

$$\|R_n\|_{\mathcal{L}(X)} \le \frac{\rho_n^2}{\gamma} \tag{3.50}$$

and

$$\|x^* - x_n^\delta\|_X \le c\left[\frac{b_n\,\rho_n^2}{\gamma}\,\delta + \min\{\|x^* - z_n\|_X : z_n \in X_n\}\right] \tag{3.51}$$

for some $c > 0$. Here, $b_n = 1$ if $x_n^\delta \in X_n$ solves (3.43); that is, $\delta$ measures the norm $\|y^\delta - y^*\|_X$ in $X$. If $\delta$ measures the discrete error $|\beta^\delta - \beta|$ in the Euclidean norm and $x_n^\delta = \sum_{j=1}^{n} \alpha_j^\delta \hat{x}_j \in X_n$, where $\alpha^\delta$ solves (3.44), then $b_n$ is given by

$$b_n = \max\left\{\sqrt{\sum_{j=1}^{n} |\rho_j|^2} \;:\; \Big\|\sum_{j=1}^{n} \rho_j \hat{x}_j\Big\|_X = 1\right\}. \tag{3.52}$$

Again, we note that $b_n = 1$ if $\{\hat{x}_j : j = 1,\ldots,n\}$ forms an orthonormal system in $X$.


Proof: (a) We apply Theorem 3.7 to the equations $Kx = y$, $x \in V'$, and $P_n K x_n = P_n y$, $x_n \in X_n$, where we consider $K$ as an operator from $V'$ into $V$. We observe that the projection operator $P_n$ is also bounded from $V$ into $X_n$, where we consider $X_n$ as a subspace of $V$. This follows from the observation that on the finite-dimensional space $X_n$ the norms $\|\cdot\|_X$ and $\|\cdot\|_V$ are equivalent and thus

$$\|P_n u\|_V \le c\,\|P_n u\|_X \le c\,\|u\|_X \le \tilde{c}\,\|u\|_V \quad\text{for } u \in V.$$

The constants $c$, and thus $\tilde{c}$, depend on $n$. Because $V$ is dense in $X$ and $X$ is dense in $V'$, we conclude that also $\bigcup_{n\in\mathbb{N}} X_n$ is dense in $V'$. To apply Theorem 3.7, we have to show that (3.42) is uniquely solvable in $X_n$ and that $R_n K : V' \to X_n \subset V'$ is uniformly bounded with respect to $n$.

Because (3.42) is a finite-dimensional quadratic system, it is sufficient to prove uniqueness. Let $x_n \in X_n$ satisfy (3.42) for $y = 0$. Because $K$ is coercive, we have

$$\gamma\,\|x_n\|_{V'}^2 \le |\langle x_n, Kx_n\rangle| = |(x_n, Kx_n)_X| = 0;$$

thus $x_n = 0$.
Now let $x \in V'$ and set $x_n = R_n K x$. Then $x_n \in X_n$ satisfies

$$(Kx_n, z_n)_X = (Kx, z_n)_X \quad\text{for all } z_n \in X_n. \tag{3.53}$$

Again, we conclude that

$$\gamma\,\|x_n\|_{V'}^2 \le |\langle x_n, Kx_n\rangle| = |(x_n, Kx_n)_X| = |(x_n, Kx)_X| = |\langle x_n, Kx\rangle| \le \|Kx\|_V\,\|x_n\|_{V'}$$

and thus

$$\|x_n\|_{V'} \le \frac{1}{\gamma}\,\|Kx\|_V \le \frac{1}{\gamma}\,\|K\|_{\mathcal{L}(V',V)}\,\|x\|_{V'}.$$

Because this holds for all $x \in V'$, we conclude that

$$\|R_n K\|_{\mathcal{L}(V')} \le \frac{1}{\gamma}\,\|K\|_{\mathcal{L}(V',V)}.$$

Then the assumptions of Theorem 3.7 are satisfied for $K : V' \to V$.


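(b) In this part we wish to apply Theorem 3.10. Let $x \in X$ and $x_n = R_n K x$. Using the estimates (3.47) and (3.49), we conclude that

$$\begin{aligned}
\|x - x_n\|_X &\le \|x - P_n x\|_X + \|P_n x - x_n\|_X \\
&\le \|x - P_n x\|_X + \rho_n\,\|P_n x - x_n\|_{V'} \\
&\le \|x - P_n x\|_X + \rho_n\,\|P_n x - x\|_{V'} + \rho_n\,\|x - x_n\|_{V'} \\
&\le \|x - P_n x\|_X + \rho_n\,\|P_n x - x\|_{V'} + c\,\rho_n \min_{z_n \in X_n} \|x - z_n\|_{V'} \\
&\le \|x - P_n x\|_X + (c+1)\,\rho_n\,\|P_n x - x\|_{V'} \\
&\le 2\,\|x\|_X + c_1\,\|x\|_X = (2 + c_1)\,\|x\|_X,
\end{aligned}$$

and thus $\|x_n\|_X \le \|x_n - x\|_X + \|x\|_X \le (3 + c_1)\,\|x\|_X$. Therefore, $\|R_n K\|_{\mathcal{L}(X)}$ are uniformly bounded.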

Next, we prove the estimate of $R_n$ in $\mathcal{L}(X)$. Let $y \in X$ and $x_n = R_n y$. We estimate

$$\gamma\,\|x_n\|_{V'}^2 \le |\langle x_n, Kx_n\rangle| = |(x_n, Kx_n)_X| = |(x_n, y)_X| \le \|y\|_X\,\|x_n\|_X \le \rho_n\,\|y\|_X\,\|x_n\|_{V'}$$

and thus

$$\|x_n\|_X \le \rho_n\,\|x_n\|_{V'} \le \frac{1}{\gamma}\,\rho_n^2\,\|y\|_X,$$

which proves the estimate (3.50). The application of Theorem 3.10 yields the
estimate (3.51).


From our general perturbation theorem (Theorem 3.8), we observe that the
assumption of $K$ being coercive can be weakened. It is sufficient to assume that
$K$ is one-to-one and satisfies Gårding's inequality. We formulate the result in
the next theorem.


Theorem 3.14 The assertions of Theorem 3.13 also hold if $K : V' \to V$ is
one-to-one and satisfies Gårding's inequality with some compact operator $C : V' \to V$.

For further reading, we refer to [204] and the monographs [17, 168, 182]. 


3.3 Application to Symm's Integral Equation of
the First Kind


In this section, we apply the Galerkin methods to an integral equation of the 
first kind that occurs in potential theory. We study the Dirichlet problem 
for harmonic functions, that is, solutions of the Laplace equation satisfying a 
boundary condition; that is, 


$$\Delta u = 0 \ \text{ in } \Omega, \qquad u = f \ \text{ on } \partial\Omega, \tag{3.54}$$

where $\Omega \subset \mathbb{R}^2$ is some bounded, simply connected region with analytic boundary $\partial\Omega$ and $f \in C(\partial\Omega)$ is some given function. The single layer potential

$$u(x) = -\frac{1}{\pi} \int_{\partial\Omega} \phi(y)\,\ln|x-y|\,ds(y), \quad x \in \Omega, \tag{3.55}$$

solves the boundary value problem (3.54) if and only if the density $\phi \in C(\partial\Omega)$ solves Symm's equation

$$-\frac{1}{\pi} \int_{\partial\Omega} \phi(y)\,\ln|x-y|\,ds(y) = f(x) \quad\text{for } x \in \partial\Omega; \tag{3.56}$$

see [53]. It is well-known (see [141]) that in general the corresponding integral
operator is not one-to-one. One has to make assumptions on the transfinite
diameter of $\Omega$; see [274]. We give a more elementary assumption in the following
theorem. 


Theorem 3.15 Suppose there exists $z_0 \in \Omega$ with $|x - z_0| \ne 1$ for all $x \in \partial\Omega$. Then the only solution $\phi \in C(\partial\Omega)$ of Symm's equation (3.56) for $f = 0$ is $\phi = 0$; that is, the integral operator is one-to-one.
Proof: We give a more elementary proof than in [142], but we still need a few results from potential theory.

From the continuity of $x \mapsto |x - z_0|$, we conclude that either $|x - z_0| < 1$ for all $x \in \partial\Omega$ or $|x - z_0| > 1$ for all $x \in \partial\Omega$. Assume first that $|x - z_0| < 1$ for all $x \in \partial\Omega$ and choose a small disk $A \subset \Omega$ with center $z_0$ such that $|x - z| < 1$ for all $x \in \partial\Omega$ and $z \in A$. Let $\phi \in C(\partial\Omega)$ satisfy (3.56) for $f = 0$ and define $u$ by

$$u(x) = -\frac{1}{\pi} \int_{\partial\Omega} \phi(y)\,\ln|x-y|\,ds(y) \quad\text{for } x \in \mathbb{R}^2.$$

From potential theory (see [53]), we conclude that $u$ is continuous in $\mathbb{R}^2$, harmonic in $\mathbb{R}^2 \setminus \partial\Omega$, and vanishes on $\partial\Omega$. The maximum principle for harmonic functions implies that $u$ vanishes in $\Omega$. We show that $u$ also vanishes in the exterior $\Omega^c$ of $\Omega$. The main part is to prove that

$$\hat{\phi} := \int_{\partial\Omega} \phi(y)\,ds(y) = 0.$$

Without loss of generality, we can assume that $\hat{\phi} \ge 0$. We study the harmonic function $v$ defined by

$$v(x) := u(x) + \frac{\hat{\phi}}{\pi}\,\ln|x - z| = \frac{1}{\pi} \int_{\partial\Omega} \phi(y)\,\ln\frac{|x-z|}{|x-y|}\,ds(y), \quad x \in \Omega^c,$$

for some $z \in A$. From the choice of $A$, we have

$$v(x) = \frac{\hat{\phi}}{\pi}\,\ln|x - z| \le 0 \quad\text{for } x \in \partial\Omega.$$
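We study the asymptotic behavior of $v(x)$ as $|x|$ tends to infinity. Elementary calculations show that

$$\frac{|x - z|}{|x - y|} = 1 + \frac{1}{|x|}\,\hat{x}\cdot(y - z) + O\big(1/|x|^2\big)$$

for $|x| \to \infty$ uniformly in $y \in \partial\Omega$, $z \in A$, and $\hat{x} := x/|x|$. In particular, $v(x)$ tends to zero as $|x|$ tends to infinity. The maximum principle applied to $v$ in $\Omega^c$ yields that $v(x) \le 0$ for all $x \in \Omega^c$. From the asymptotic formula and $\ln(1+\varepsilon) = \varepsilon + O(\varepsilon^2)$, we conclude furthermore that

$$v(x) = \frac{1}{\pi|x|}\,\hat{x}\cdot\int_{\partial\Omega} \phi(y)\,(y - z)\,ds(y) + O\big(1/|x|^2\big)$$

and thus

$$\hat{x}\cdot\int_{\partial\Omega} \phi(y)\,(y - z)\,ds(y) \le 0 \quad\text{for all } |\hat{x}| = 1.$$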
This implies that

$$\int_{\partial\Omega} \phi(y)\,y\,ds(y) = z \int_{\partial\Omega} \phi(y)\,ds(y).$$

Because this holds for all $z \in A$, we conclude that $\int_{\partial\Omega} \phi(y)\,ds(y) = 0$.
Now we see from the definition of $v$ (for any fixed $z \in A$) that

$$u(x) = v(x) \to 0 \quad\text{as } |x| \to \infty.$$

The maximum principle again yields $u = 0$ in $\Omega^c$.
Finally, the jump conditions of the normal derivative of the single layer potential operator (see [53]) yield

$$2\phi(x) = \lim_{\varepsilon \to 0+} \big[\nabla u(x - \varepsilon\nu(x)) - \nabla u(x + \varepsilon\nu(x))\big]\cdot\nu(x) = 0$$

for $x \in \partial\Omega$, where $\nu(x)$ denotes the unit normal vector at $x \in \partial\Omega$ directed into the exterior of $\Omega$.

This ends the proof for the case that $\max_{x\in\partial\Omega} |x - z_0| < 1$. The case $\min_{x\in\partial\Omega} |x - z_0| > 1$ is settled by the same arguments.


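Now we assume that the boundary $\partial\Omega$ has a parametrization of the form

$$x = \gamma(s), \quad s \in [0, 2\pi],$$

for some $2\pi$-periodic analytic function $\gamma : [0,2\pi] \to \mathbb{R}^2$ that satisfies $|\dot{\gamma}(s)| > 0$ for all $s \in [0,2\pi]$. Then Symm's equation (3.56) takes the form

$$-\frac{1}{\pi} \int_0^{2\pi} \psi(s)\,\ln|\gamma(t) - \gamma(s)|\,ds = g(t) := f(\gamma(t)) \quad\text{for } t \in [0,2\pi] \tag{3.57}$$

for the transformed density $\psi(s) := \phi(\gamma(s))\,|\dot{\gamma}(s)|$, $s \in [0,2\pi]$.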
For the special case where $\Omega$ is the disk with center 0 and radius $a > 0$, we have $\gamma_a(s) = a\,(\cos s, \sin s)$ and thus

$$\ln|\gamma_a(t) - \gamma_a(s)| = \ln a + \frac{1}{2}\,\ln\Big(4\sin^2\frac{t-s}{2}\Big). \tag{3.58}$$

For general boundaries, we can split the kernel in the form

$$-\frac{1}{\pi}\,\ln|\gamma(t) - \gamma(s)| = -\frac{1}{2\pi}\,\ln\Big(4\sin^2\frac{t-s}{2}\Big) + k(t,s), \quad t \ne s, \tag{3.59}$$

for some function $k$ that is analytic for $t \ne s$. From the mean value theorem, we conclude that

$$\lim_{s \to t} k(t,s) = -\frac{1}{\pi}\,\ln|\dot{\gamma}(t)|.$$

This implies that $k$ has an analytic continuation onto $[0,2\pi] \times [0,2\pi]$. With this splitting, the integral equation (3.57) takes the form

$$-\frac{1}{2\pi} \int_0^{2\pi} \psi(s)\,\ln\Big(4\sin^2\frac{t-s}{2}\Big)\,ds + \int_0^{2\pi} \psi(s)\,k(t,s)\,ds = g(t) \tag{3.60}$$

for $t \in [0,2\pi]$. We want to apply the results of the previous section on Galerkin
methods to this integral equation.

As the Hilbert space $X$, we choose $X = L^2(0,2\pi)$. The operators $K$, $K_0$, and $C$ are defined by

$$(K\psi)(t) := -\frac{1}{\pi} \int_0^{2\pi} \psi(s)\,\ln|\gamma(t) - \gamma(s)|\,ds, \tag{3.61a}$$

$$(K_0\psi)(t) := -\frac{1}{2\pi} \int_0^{2\pi} \psi(s)\,\Big[\ln\Big(4\sin^2\frac{t-s}{2}\Big) - 1\Big]\,ds, \tag{3.61b}$$

$$C\psi := K\psi - K_0\psi \tag{3.61c}$$

for $t \in [0,2\pi]$ and $\psi \in L^2(0,2\pi)$. First, we observe that $K$, $K_0$, and $C$ are well-defined and compact operators in $L^2(0,2\pi)$ because the kernels are weakly singular (see Theorem A.35 of Appendix A.3). They are also self-adjoint in $L^2(0,2\pi)$. Then Symm's equation (3.57) takes the form $K\psi = g$ in the Hilbert space $L^2(0,2\pi)$.

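As finite-dimensional subspaces $X_n$ and $Y_n$, we choose the spaces of truncated Fourier series, that is,

$$X_n = Y_n = \Big\{ \sum_{j=-n}^{n} \alpha_j \hat{\psi}_j : \alpha_j \in \mathbb{C} \Big\}, \tag{3.62}$$

where $\hat{\psi}_j(t) = e^{ijt}$ for $t \in [0,2\pi]$ and $j \in \mathbb{Z}$. The corresponding orthogonal projection operators $P_n$ from $L^2(0,2\pi)$ into $X_n = Y_n$ are given as

$$P_n \psi = \sum_{j=-n}^{n} \alpha_j \hat{\psi}_j \tag{3.63}$$

where $\psi = \sum_{j\in\mathbb{Z}} \alpha_j \hat{\psi}_j$.
To investigate the mapping properties of $K$, we need the following technical
result (see [168], Lemma 8.21).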


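Lemma 3.16

$$\frac{1}{2\pi} \int_0^{2\pi} e^{ins}\,\ln\Big(4\sin^2\frac{s}{2}\Big)\,ds = \begin{cases} -1/|n|, & n \in \mathbb{Z},\ n \ne 0, \\ 0, & n = 0. \end{cases} \tag{3.64}$$

Proof: It suffices to study the case $n \in \mathbb{N}_0$. First let $n \in \mathbb{N}$. Integrating the geometric sum

$$1 + 2\sum_{j=1}^{n-1} e^{ijs} + e^{ins} = i\,\big(1 - e^{ins}\big)\cot\frac{s}{2}, \quad 0 < s < 2\pi,$$

yields

$$\int_0^{2\pi} \big(e^{ins} - 1\big)\cot\frac{s}{2}\,ds = 2\pi i.$$

Integration of the identity

$$\frac{d}{ds}\Big[\big(e^{ins} - 1\big)\ln\Big(4\sin^2\frac{s}{2}\Big)\Big] = i n\,e^{ins}\,\ln\Big(4\sin^2\frac{s}{2}\Big) + \big(e^{ins} - 1\big)\cot\frac{s}{2}$$

yields

$$\int_0^{2\pi} e^{ins}\,\ln\Big(4\sin^2\frac{s}{2}\Big)\,ds = -\frac{1}{in} \int_0^{2\pi} \big(e^{ins} - 1\big)\cot\frac{s}{2}\,ds = -\frac{2\pi}{n},$$

which proves the assertion for $n \in \mathbb{N}$.
It remains to study the case where $n = 0$. Define

$$I := \int_0^{2\pi} \ln\Big(4\sin^2\frac{s}{2}\Big)\,ds.$$

Then we conclude that

$$\begin{aligned}
2I &= \int_0^{2\pi} \ln\Big(4\sin^2\frac{s}{2}\Big)\,ds + \int_0^{2\pi} \ln\Big(4\cos^2\frac{s}{2}\Big)\,ds \\
&= \int_0^{2\pi} \ln\Big(16\sin^2\frac{s}{2}\cos^2\frac{s}{2}\Big)\,ds \\
&= \int_0^{2\pi} \ln\big(4\sin^2 s\big)\,ds = \frac{1}{2}\int_0^{4\pi} \ln\Big(4\sin^2\frac{s}{2}\Big)\,ds = I
\end{aligned}$$

and thus $I = 0$.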
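The identity (3.64) can be checked numerically. The following is a rough sketch, assuming a plain midpoint rule is accurate enough for the integrable logarithmic singularity at the endpoints; the function names and the number of quadrature points are illustrative choices, not part of the text.

```python
import numpy as np

def lemma_3_16_lhs(n, num_points=200_000):
    """Midpoint-rule approximation of (1/2pi) * int_0^{2pi} e^{ins} ln(4 sin^2(s/2)) ds."""
    h = 2.0 * np.pi / num_points
    s = (np.arange(num_points) + 0.5) * h      # midpoints avoid s = 0 and s = 2*pi
    integrand = np.exp(1j * n * s) * np.log(4.0 * np.sin(0.5 * s) ** 2)
    return integrand.sum() * h / (2.0 * np.pi)

def lemma_3_16_rhs(n):
    """Right-hand side of (3.64)."""
    return 0.0 if n == 0 else -1.0 / abs(n)

for n in (-3, 0, 1, 5):
    print(n, lemma_3_16_lhs(n), lemma_3_16_rhs(n))
```

The two columns agree up to the quadrature error, which is small compared with $1/|n|$ for the frequencies shown.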


This lemma shows that the functions

$$\hat{\psi}_n(t) := e^{int}, \quad t \in [0,2\pi], \quad n \in \mathbb{Z}, \tag{3.65}$$

are eigenfunctions of $K_0$:

$$K_0 \hat{\psi}_n = \frac{1}{|n|}\,\hat{\psi}_n \quad\text{for } n \ne 0 \quad\text{and} \tag{3.66a}$$

$$K_0 \hat{\psi}_0 = \hat{\psi}_0. \tag{3.66b}$$

Now we can prove the mapping properties of the operators.


Theorem 3.17 Suppose there exists $z_0 \in \Omega$ with $|x - z_0| \ne 1$ for all $x \in \partial\Omega$. Let the operators $K$ and $K_0$ be given by (3.61a) and (3.61b), respectively. By $H^s_{\mathrm{per}}(0,2\pi)$, we denote the Sobolev spaces of order $s$ (see Section A.4 of Appendix A).

(a) The operators $K$ and $K_0$ can be extended to isomorphisms from $H^{s-1}_{\mathrm{per}}(0,2\pi)$ onto $H^s_{\mathrm{per}}(0,2\pi)$ for every $s \in \mathbb{R}$.

(b) The operator $K_0$ is coercive from $H^{-1/2}_{\mathrm{per}}(0,2\pi)$ onto $H^{1/2}_{\mathrm{per}}(0,2\pi)$.

(c) The operator $C = K - K_0$ is compact from $H^{s-1}_{\mathrm{per}}(0,2\pi)$ into $H^s_{\mathrm{per}}(0,2\pi)$ for every $s \in \mathbb{R}$.


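Proof: Let $\psi \in L^2(0,2\pi)$. Then $\psi$ has the representation

$$\psi(t) = \sum_{n\in\mathbb{Z}} \alpha_n e^{int} \quad\text{with}\quad \sum_{n\in\mathbb{Z}} |\alpha_n|^2 < \infty.$$

From (3.66a) and (3.66b), we have

$$(K_0\psi)(t) = \alpha_0 + \sum_{n\ne 0} \frac{1}{|n|}\,\alpha_n e^{int}$$

and thus for any $s \in \mathbb{R}$:

$$\|K_0\psi\|_{H^s_{\mathrm{per}}}^2 = |\alpha_0|^2 + \sum_{n\ne 0} (1+n^2)^s\,\frac{1}{n^2}\,|\alpha_n|^2,$$

$$(\psi, K_0\psi)_{L^2} = 2\pi\Big[|\alpha_0|^2 + \sum_{n\ne 0} \frac{1}{|n|}\,|\alpha_n|^2\Big] \ge 2\pi \sum_{n\in\mathbb{Z}} (1+n^2)^{-1/2}\,|\alpha_n|^2 = 2\pi\,\|\psi\|_{H^{-1/2}_{\mathrm{per}}}^2.$$

From the elementary estimate

$$(1+n^2)^{s-1} \le \frac{(1+n^2)^s}{n^2} \le \frac{(1+n^2)^s}{\frac{1}{2}(1+n^2)} = 2\,(1+n^2)^{s-1}, \quad n \ne 0,$$

we see that $K_0$ can be extended to an isomorphism from $H^{s-1}_{\mathrm{per}}(0,2\pi)$ onto $H^s_{\mathrm{per}}(0,2\pi)$ and is coercive for $s = 1/2$. The operator $C$ is bounded from $H^r_{\mathrm{per}}(0,2\pi)$ into $H^s_{\mathrm{per}}(0,2\pi)$ for all $r,s \in \mathbb{R}$ by Theorem A.47 of Appendix A.4. This proves part (c) and that $K = K_0 + C$ is bounded from $H^{s-1}_{\mathrm{per}}(0,2\pi)$ into $H^s_{\mathrm{per}}(0,2\pi)$. It remains to show that $K$ is also an isomorphism from $H^{s-1}_{\mathrm{per}}(0,2\pi)$ onto $H^s_{\mathrm{per}}(0,2\pi)$. From the Riesz theory (Theorem A.36), it is sufficient to prove injectivity. Let $\psi \in H^{s-1}_{\mathrm{per}}(0,2\pi)$ with $K\psi = 0$. From $K_0\psi = -C\psi$ and the mapping properties of $C$, we conclude that $K_0\psi \in H^r_{\mathrm{per}}(0,2\pi)$ for all $r \in \mathbb{R}$, that is, $\psi \in H^r_{\mathrm{per}}(0,2\pi)$ for all $r \in \mathbb{R}$. In particular, this implies that $\psi$ is continuous and the transformed function $\phi(\gamma(t)) = \psi(t)/|\dot{\gamma}(t)|$ satisfies Symm's equation (3.56) for $f = 0$. The application of Theorem 3.15 yields $\phi = 0$.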


We are now in a position to apply all of the Galerkin methods of the previous section to Symm's equation (3.56), that is, $K\psi = g$ in $L^2(0,2\pi)$.

We have seen that the convergence results require estimates of the condition numbers of $K$ on the finite-dimensional spaces $X_n$ and also approximation properties of $P_n\psi$. In Lemma A.45 of Appendix A, we show the following estimates for any $r \ge s$:

$$\|\psi_n\|_{H^r_{\mathrm{per}}} \le c\,n^{r-s}\,\|\psi_n\|_{H^s_{\mathrm{per}}} \quad\text{for all } \psi_n \in X_n, \tag{3.67a}$$

$$\|P_n\psi - \psi\|_{H^s_{\mathrm{per}}} \le \frac{1}{n^{r-s}}\,\|\psi\|_{H^r_{\mathrm{per}}} \quad\text{for all } \psi \in H^r_{\mathrm{per}}(0,2\pi), \tag{3.67b}$$

and all $n \in \mathbb{N}$. Estimate (3.67a) is sometimes called the stability property (see [142]). From (3.67a) and the continuity of $K^{-1}$ from $L^2(0,2\pi)$ into $H^{-1}_{\mathrm{per}}(0,2\pi)$, we conclude that

$$\|\psi_n\|_{L^2} \le c\,n\,\|K\psi_n\|_{L^2} \quad\text{for all } \psi_n \in X_n. \tag{3.67c}$$

Indeed, this follows from

$$\|\psi_n\|_{L^2} \le c\,n\,\|\psi_n\|_{H^{-1}_{\mathrm{per}}} = c\,n\,\|K^{-1}K\psi_n\|_{H^{-1}_{\mathrm{per}}} \le c\,n\,\|K^{-1}\|_{\mathcal{L}(L^2,H^{-1}_{\mathrm{per}})}\,\|K\psi_n\|_{L^2}.$$

Combining these estimates with the convergence results of the previous section, we have (almost) shown the following.¹


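Theorem 3.18 Let $\psi^* \in H^r_{\mathrm{per}}(0,2\pi)$ be the unique solution of (3.57), that is,

$$(K\psi^*)(t) := -\frac{1}{\pi} \int_0^{2\pi} \psi^*(s)\,\ln|\gamma(t) - \gamma(s)|\,ds = g^*(t) := f(\gamma(t)),$$

for $t \in [0,2\pi]$ and some $g^* \in H^{r+1}_{\mathrm{per}}(0,2\pi)$ for some $r \ge 0$. Let $g^\delta \in L^2(0,2\pi)$ with $\|g^\delta - g^*\|_{L^2} \le \delta$ and let $X_n$ be defined by (3.62).

(a) Let $\psi_n^\delta \in X_n$ be the least squares solution, that is, the solution of

$$(K\psi_n^\delta, K\phi_n)_{L^2} = (g^\delta, K\phi_n)_{L^2} \quad\text{for all } \phi_n \in X_n \tag{3.68a}$$

or

(b) Let $\psi_n^\delta = K\tilde{\psi}_n^\delta$ with $\tilde{\psi}_n^\delta \in X_n$ be the dual least squares solution, that is, $\tilde{\psi}_n^\delta$ solves

$$(K\tilde{\psi}_n^\delta, K\phi_n)_{L^2} = (g^\delta, \phi_n)_{L^2} \quad\text{for all } \phi_n \in X_n \tag{3.68b}$$

or

(c) Let $\psi_n^\delta \in X_n$ be the Bubnov-Galerkin solution, that is, the solution of

$$(K\psi_n^\delta, \phi_n)_{L^2} = (g^\delta, \phi_n)_{L^2} \quad\text{for all } \phi_n \in X_n. \tag{3.68c}$$

Then there exists $c > 0$ with

$$\|\psi_n^\delta - \psi^*\|_{L^2} \le c\Big(n\,\delta + \frac{1}{n^r}\,\|\psi^*\|_{H^r_{\mathrm{per}}}\Big) \tag{3.69}$$

for all $n \in \mathbb{N}$.

¹ Note that in this theorem, $\psi^*$ denotes the exact solution in accordance with the notation of the general theory. It must not be mixed up with the complex conjugate.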


Proof: We apply Theorems 3.11, 3.12, and 3.14 (the latter with $V = H^{1/2}_{\mathrm{per}}(0,2\pi)$ and $V' = H^{-1/2}_{\mathrm{per}}(0,2\pi)$).

For the least squares method (Theorem 3.11), we have to show Assumption (3.33). By (3.67c) we have

$$\sigma_n = \max\{\|\phi_n\|_{L^2} : \phi_n \in X_n,\ \|K\phi_n\|_{L^2} = 1\} \le c\,n, \tag{3.70}$$

and thus, using (3.67b) for $r = 0$ and $s = -1$,

$$\begin{aligned}
\min_{\phi_n \in X_n} \big\{\|\psi - \phi_n\|_{L^2} + \sigma_n\,\|K(\psi - \phi_n)\|_{L^2}\big\}
&\le \|\psi - P_n\psi\|_{L^2} + \sigma_n\,\|K(\psi - P_n\psi)\|_{L^2} \\
&\le \|\psi\|_{L^2} + c\,n\,\|K\|_{\mathcal{L}(H^{-1}_{\mathrm{per}},L^2)}\,\|\psi - P_n\psi\|_{H^{-1}_{\mathrm{per}}} \le c'\,\|\psi\|_{L^2}.
\end{aligned}$$

Furthermore,

$$\min\{\|\psi^* - \phi_n\|_{L^2} : \phi_n \in X_n\} \le \|\psi^* - P_n\psi^*\|_{L^2} \le \frac{c}{n^r}\,\|\psi^*\|_{H^r_{\mathrm{per}}},$$

which proves the result for the least squares method.

The estimate for the dual least squares method follows immediately from (3.40) and $\sigma_n \le c\,n$.

For the Bubnov-Galerkin method, we have to estimate $\rho_n$ from (3.48):

$$\rho_n = \max\{\|\phi_n\|_{L^2} : \phi_n \in X_n,\ \|\phi_n\|_{H^{-1/2}_{\mathrm{per}}} = 1\} \le c\,\sqrt{n}$$

by (3.67a) for $r = 0$ and $s = -1/2$. This ends the proof.


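It is interesting to note that different error estimates hold for discrete perturbations of the right-hand side. Let us denote by $\beta_k$ the right-hand sides of (3.68a), (3.68b), or (3.68c), respectively, for $\phi_k(t) = \exp(ikt)$. Assume that $\beta$ is perturbed by a vector $\beta^\delta \in \mathbb{C}^{2n+1}$ with $|\beta - \beta^\delta| \le \delta$. We have to compute $b_n$ of (3.35), (3.41), and (3.52), respectively. Because the functions $\hat{\psi}_k(t) = \exp(ikt)$, $k = -n,\ldots,n$, are orthogonal, we compute $b_n$ for (3.41) and (3.52) by

$$b_n^2 = \max\Big\{\sum_{j=-n}^{n} |\rho_j|^2 : \Big\|\sum_{j=-n}^{n} \rho_j\hat{\psi}_j\Big\|_{L^2} = 1\Big\} = \frac{1}{2\pi}.$$

For the least squares method, however, we have to compute

$$\begin{aligned}
b_n^2 &= \max\Big\{\sum_{j=-n}^{n} |\rho_j|^2 : \Big\|\sum_{j=-n}^{n} \rho_j K_0\hat{\psi}_j\Big\|_{L^2} = 1\Big\} \\
&= \max\Big\{\sum_{j=-n}^{n} |\rho_j|^2 : 2\pi\Big(|\rho_0|^2 + \sum_{j\ne 0} \frac{1}{j^2}\,|\rho_j|^2\Big) = 1\Big\} = \frac{n^2}{2\pi};
\end{aligned}$$

that is, for discrete perturbations of the right-hand side, the estimate (3.69) is asymptotically the same for the dual least squares method and the Bubnov-Galerkin method, while for the least squares method it has to be replaced by

$$\|\psi_n^\delta - \psi^*\|_{L^2} \le c\Big(n^2\,\delta + \frac{1}{n^r}\,\|\psi^*\|_{H^r_{\mathrm{per}}}\Big). \tag{3.71}$$

The error estimates (3.69) are optimal under the a priori information $\psi^* \in H^r_{\mathrm{per}}(0,2\pi)$ and $\|\psi^*\|_{H^r_{\mathrm{per}}} \le 1$. This is seen by choosing $n \sim (1/\delta)^{1/(r+1)}$, which gives the asymptotic estimate

$$\|\psi_{n(\delta)}^\delta - \psi^*\|_{L^2} \le c\,\delta^{r/(r+1)}.$$

This is optimal by Problem 3.4.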
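The balancing of the two terms in (3.69) can be made concrete with a few lines of arithmetic. The sketch below assumes the constant $c = 1$ and $\|\psi^*\|_{H^r_{\mathrm{per}}} = 1$ purely for illustration; the function names `truncation_index` and `error_bound` are hypothetical.

```python
import math

def truncation_index(delta, r):
    """A priori choice n ~ (1/delta)^(1/(r+1)), rounded up to an integer >= 1."""
    return max(1, math.ceil(delta ** (-1.0 / (r + 1.0))))

def error_bound(n, delta, r, c=1.0, norm_psi=1.0):
    """Right-hand side of (3.69): c * (n*delta + norm_psi / n^r)."""
    return c * (n * delta + norm_psi / n ** r)

for delta in (1e-1, 1e-3, 1e-6):
    for r in (0.5, 1.0, 2.0):
        n = truncation_index(delta, r)
        rate = delta ** (r / (r + 1.0))
        print(f"delta={delta:g}  r={r:g}  n={n}  bound={error_bound(n, delta, r):.3e}  "
              f"delta^(r/(r+1))={rate:.3e}")
```

For each pair the bound stays within a fixed multiple of $\delta^{r/(r+1)}$, which is the rate stated above.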
From the preceding analysis, it is clear that the convergence property

$$\min\{\|\psi^* - \phi_n\|_{H^s_{\mathrm{per}}} : \phi_n \in X_n\} \le c\,\Big(\frac{1}{n}\Big)^{r-s}\,\|\psi^*\|_{H^r_{\mathrm{per}}}, \quad \psi^* \in H^r_{\mathrm{per}}(0,2\pi),$$

and the stability property

$$\|\phi_n\|_{H^r_{\mathrm{per}}} \le c\,n^{r-s}\,\|\phi_n\|_{H^s_{\mathrm{per}}}, \quad \phi_n \in X_n,$$

for $r \ge s$ and $n \in \mathbb{N}$, are the essential tools in the proofs. For regions $\Omega$ with nonsmooth boundaries, finite element spaces for $X_n$ are more suitable. They satisfy these conditions for a certain range of values of $r$ and $s$ (depending on the smoothness of the solution and the order of the finite elements). We refer to [65, 68, 141-143, 268] for more details and boundary value problems for more complicated partial differential equations.

We refer to Problem 3.5 and Section 3.5, where the Galerkin methods are 
explicitly compared for special cases of Symm’s equation. 

For further literature on Symm’s and related integral equations, we refer to 
[11, 12, 22, 83, 223, 249, 275]. 


3.4 Collocation Methods 


We have seen that collocation methods are subsumed under the general theory 
of projection methods through the use of interpolation operators. This requires 
the space $Y$ to be a reproducing kernel Hilbert space, that is, a Hilbert space
in which all the evaluation functionals $y \mapsto y(t)$ for $y \in Y$ and $t \in [a,b]$ are
bounded.

Instead of presenting a general theory as in [202], we avoid the explicit intro- 
duction of reproducing kernel Hilbert spaces and investigate only two special, 
but important, cases in detail. First, we study the minimum norm collocation 
method. It turns out that this is a special case of a least squares method and 
can be treated by the methods of the previous section. In Subsection 3.4.2, we 
investigate a second collocation method for the important example of Symm’s 
equation. We derive a complete and satisfactory error analysis for two choices 
of ansatz functions. 

First, we formulate the general collocation method again and derive an error 
estimate in the presence of discrete perturbations of the right-hand side. 

Let $X$ be a Hilbert space over the field $\mathbb{K}$, $X_n \subset X$ be finite-dimensional subspaces with $\dim X_n = n$, and $a \le t_1 < \cdots < t_n \le b$ be the collocation points. Let $K : X \to C[a,b]$ be bounded and one-to-one. Let $Kx^* = y^*$, and assume that the collocation equations

$$(Kx_n)(t_i) = y(t_i), \quad i = 1,\ldots,n, \tag{3.72}$$

are uniquely solvable in $X_n$ for every right-hand side. Choosing a basis $\{\hat{x}_j : j = 1,\ldots,n\}$ of $X_n$, we rewrite this as a system $A\alpha = \beta$, where $x_n = \sum_{j=1}^{n} \alpha_j \hat{x}_j$ and

$$A_{ij} = (K\hat{x}_j)(t_i), \qquad \beta_i = y(t_i). \tag{3.73}$$


The following main theorem is the analogue of Theorem 3.10 for collocation 
methods. We restrict ourselves to the important case of discrete perturbations 
of the right-hand side. Continuous perturbations could also be handled but are 
not of particular interest because point evaluation is no longer possible when 
the right-hand side is perturbed in the $L^2$-sense. This would require stronger
norms in the range space and leads to the concept of reproducing kernel Hilbert 
spaces (see [168]). 
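Theorem 3.19 Let $Kx^* = y^*$ and let $\{t_1^{(n)},\ldots,t_n^{(n)}\} \subset [a,b]$, $n \in \mathbb{N}$, be a sequence of collocation points. Assume that $\bigcup_{n\in\mathbb{N}} X_n$ is dense in $X$ and that the collocation method converges. Let $x_n^\delta = \sum_{j=1}^{n} \alpha_j^\delta \hat{x}_j \in X_n$, where $\alpha^\delta$ solves $A\alpha^\delta = \beta^\delta$. Here, $\beta^\delta \in \mathbb{K}^n$ satisfies $|\beta - \beta^\delta| \le \delta$ where $\beta_i = y^*(t_i)$ and $|\cdot|$ denotes again the Euclidean norm in $\mathbb{K}^n$. Then the following error estimate holds:

$$\|x_n^\delta - x^*\|_X \le c\Big(\frac{a_n}{\lambda_n}\,\delta + \inf\{\|x^* - z_n\|_X : z_n \in X_n\}\Big), \tag{3.74}$$

where

$$a_n = \max\Big\{\Big\|\sum_{j=1}^{n} \rho_j \hat{x}_j\Big\|_X : \sum_{j=1}^{n} |\rho_j|^2 = 1\Big\} \tag{3.75}$$

and $\lambda_n$ denotes the smallest singular value of $A$.


Proof: Again we write $\|x_n^\delta - x^*\|_X \le \|x_n^\delta - x_n\|_X + \|x_n - x^*\|_X$, where $x_n := R_n y^*$ solves the collocation equation for $\beta$ instead of $\beta^\delta$. The second term is estimated by Theorem 3.7. We estimate the first term by

$$\|x_n^\delta - x_n\|_X \le a_n\,|\alpha^\delta - \alpha| = a_n\,|A^{-1}(\beta^\delta - \beta)| \le a_n\,|A^{-1}|_2\,|\beta^\delta - \beta| \le \frac{a_n}{\lambda_n}\,\delta.$$

Again we remark that $a_n = 1$ if $\{\hat{x}_j : j = 1,\ldots,n\}$ forms an orthonormal system in $X$.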


3.4.1 Minimum Norm Collocation 


Again, let $K : X \to C[a,b]$ be a linear, bounded, and injective operator from the Hilbert space $X$ into the space $C[a,b]$ of continuous functions on $[a,b]$. We assume that there exists a unique solution $x^* \in X$ of $Kx^* = y^*$. Let $a \le t_1 < \cdots < t_n \le b$ be the set of collocation points. Solving the equations (3.72) in $X$ is certainly not enough to specify the solution $x_n$ uniquely. An obvious choice is to determine $x_n \in X$ from the set of solutions of (3.72) that has a minimal $L^2$-norm among all solutions.


Definition 3.20 $x_n \in X$ is called the moment solution of (3.72) with respect to the collocation points $a \le t_1 < \cdots < t_n \le b$ if $x_n$ satisfies (3.72) and

$$\|x_n\|_X = \min\{\|z_n\|_X : z_n \in X \text{ satisfies (3.72)}\}.$$


We can interpret this moment solution as a least squares solution. Because $z \mapsto (Kz)(t_i)$ is bounded from $X$ into $\mathbb{K}$, Theorem A.23 by Riesz yields the existence of $k_i \in X$ with $(Kz)(t_i) = (z,k_i)_X$ for all $z \in X$ and $i = 1,\ldots,n$.

If, for example, $X = L^2(a,b)$ and $K$ is the integral operator

$$(Kz)(t) = \int_a^b k(t,s)\,z(s)\,ds, \quad t \in [a,b],\ z \in L^2(a,b),$$

with real-valued kernel $k$ then $k_i \in L^2(a,b)$ is explicitly given by $k_i(s) = k(t_i,s)$.

Going back to the general case, we rewrite the moment equation (3.72) in the form

$$(x_n,k_i)_X = y(t_i) = (x,k_i)_X, \quad i = 1,\ldots,n.$$

The minimum norm solution $x_n$ of the set of equations is characterized by the projection theorem (see Theorem A.13 of Appendix A.1) and is given by the solution of (3.72) in the space $X_n := \operatorname{span}\{k_j : j = 1,\ldots,n\}$.

Now we define the Hilbert space $Y$ by $Y := K(X) = \mathcal{R}(K)$ with inner product

$$(y,z)_Y := (K^{-1}y, K^{-1}z)_X \quad\text{for } y,z \in K(X) = \mathcal{R}(K).$$

We omit the simple proof of the following lemma.


Lemma 3.21 $Y$ is a Hilbert space that is continuously embedded in $C[a,b]$. Furthermore, $K$ is an isomorphism from $X$ onto $Y$.


Now we can rewrite (3.72) in the form

$$(Kx_n, Kk_i)_Y = (y, Kk_i)_Y, \quad i = 1,\ldots,n.$$

Comparing this equation with (3.31a), we observe that (3.72) is the Galerkin equation for the least squares method with respect to $X_n$. Thus we have shown that the moment solution can be interpreted as the least squares solution for the operator $K : X \to Y$. The application of Theorem 3.11 yields the following theorem.


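Theorem 3.22 Let $K$ be one-to-one and $\{k_j : j = 1,\ldots,n\}$ be linearly independent where $k_j \in X$ are such that $(Kz)(t_j) = (z,k_j)_X$ for all $z \in X$, $j = 1,\ldots,n$. Then there exists one and only one moment solution $x_n$ of (3.72). $x_n$ is given by

$$x_n = \sum_{j=1}^{n} \alpha_j k_j, \tag{3.76}$$

where $\alpha \in \mathbb{K}^n$ solves the linear system $A\alpha = \beta$ with

$$A_{ij} = (Kk_j)(t_i) = (k_j,k_i)_X \quad\text{and}\quad \beta_i = y(t_i). \tag{3.77}$$

Let $\{t_1^{(n)},\ldots,t_n^{(n)}\} \subset [a,b]$, $n \in \mathbb{N}$, be a sequence of collocation points such that $\bigcup_{n\in\mathbb{N}} X_n$ is dense in $X$ where

$$X_n := \operatorname{span}\{k_j^{(n)} : j = 1,\ldots,n\}.$$

Then the moment method converges; that is, the moment solution $x_n \in X_n$ of (3.72) converges in $X$ to the solution $x^* \in X$ of $Kx^* = y^*$. If $x_n^\delta = \sum_{j=1}^{n} \alpha_j^\delta k_j^{(n)}$, where $\alpha^\delta \in \mathbb{K}^n$ solves $A\alpha^\delta = \beta^\delta$ with $|\beta - \beta^\delta| \le \delta$, then the following error estimate holds:

$$\|x^* - x_n^\delta\|_X \le \frac{a_n}{\lambda_n}\,\delta + c\,\min\{\|x^* - z_n\|_X : z_n \in X_n\}, \tag{3.78}$$

where

$$a_n = \max\Big\{\Big\|\sum_{j=1}^{n} \rho_j k_j^{(n)}\Big\|_X : \sum_{j=1}^{n} |\rho_j|^2 = 1\Big\} \tag{3.79}$$

and where $\lambda_n$ denotes the smallest singular value of $A$.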


Proof: The definition of $\|\cdot\|_Y$ implies that $\sigma_n = 1$, where $\sigma_n$ is given by (3.32). Assumption (3.33) for the convergence of the least squares method is obviously satisfied because

$$\min_{z_n \in X_n} \big\{\|x^* - z_n\|_X + \sigma_n\,\|K(x^* - z_n)\|_Y\big\} \le \|x^*\|_X + \sigma_n\,\|x^*\|_X = 2\,\|x^*\|_X.$$

The application of Theorem 3.11 yields the assertion.


As an example, we again consider numerical differentiation. 


Example 3.23
Let $X = L^2(0,1)$ and $K$ be defined by

$$(Kx)(t) = \int_0^t x(s)\,ds = \int_0^1 k(t,s)\,x(s)\,ds, \quad t \in [0,1],$$

with $k(t,s) = \begin{cases} 1, & s \le t, \\ 0, & s > t. \end{cases}$

We choose equidistant nodes, that is, $t_j = j/n$ for $j = 0,\ldots,n$. The moment method is to minimize $\|x\|_{L^2}^2$ under the restrictions that

$$\int_0^{t_j} x(s)\,ds = y(t_j), \quad j = 1,\ldots,n. \tag{3.80}$$

The solution $x_n$ is piecewise constant because it is a linear combination of the piecewise constant functions $k(t_j,\cdot)$. Therefore, the finite-dimensional space $X_n$ is given by

$$X_n = \{z_n \in L^2(0,1) : z_n|_{(t_{j-1},t_j)} \text{ constant},\ j = 1,\ldots,n\}. \tag{3.81}$$

As basis functions $\hat{x}_j$ of $X_n$, we choose $\hat{x}_j(s) = k(t_j,s)$.
Then $x_n = \sum_{j=1}^{n} \alpha_j k(t_j,\cdot)$ is the moment solution, where $\alpha$ solves $A\alpha = \beta$ with $\beta_i = y(t_i)$ and

$$A_{ij} = \int_0^1 k(t_i,s)\,k(t_j,s)\,ds = \frac{1}{n}\,\min\{i,j\}.$$

It is not difficult to see that the moment solution is just the one-sided difference quotient

$$x_n(t_1) = \frac{1}{h}\,y(t_1), \qquad x_n(t_j) = \frac{1}{h}\,\big[y(t_j) - y(t_{j-1})\big], \quad j = 2,\ldots,n,$$

for $h = 1/n$.
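The claim that the moment solution reduces to one-sided difference quotients can be verified directly. The following is a minimal sketch, assuming data values $y(t_j)$ are given at the nodes $t_j = j/n$; the function name `moment_solution_values` and the test function $y(t) = \sin t$ are illustrative choices.

```python
import numpy as np

def moment_solution_values(y_nodes):
    """Values of the moment solution x_n on the subintervals ((i-1)/n, i/n), i = 1..n.

    y_nodes holds y(t_1), ..., y(t_n) for the equidistant nodes t_j = j/n.
    """
    n = len(y_nodes)
    idx = np.arange(1, n + 1)
    A = np.minimum.outer(idx, idx) / n          # A_ij = min(i, j) / n
    alpha = np.linalg.solve(A, y_nodes)
    # k(t_j, s) = 1 on the i-th subinterval exactly when j >= i,
    # so x_n equals sum_{j >= i} alpha_j there (a reversed cumulative sum).
    return np.cumsum(alpha[::-1])[::-1]

n = 8
t = np.arange(1, n + 1) / n
y_nodes = np.sin(t)                              # y = Kx with x = cos, so y(0) = 0
x_moment = moment_solution_values(y_nodes)
x_diffquot = n * np.diff(np.concatenate(([0.0], y_nodes)))
print(np.max(np.abs(x_moment - x_diffquot)))     # agreement up to rounding
```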

We have to check the assumptions of Theorem 3.22. First, $K$ is one-to-one and $\{k(t_j,\cdot) : j = 1,\ldots,n\}$ are linearly independent. The union $\bigcup_{n\in\mathbb{N}} X_n$ is dense in $L^2(0,1)$ (see Problem 3.6). We have to estimate $a_n$ from (3.79), the smallest eigenvalue $\lambda_n$ of $A$, and $\min\{\|x^* - z_n\|_{L^2} : z_n \in X_n\}$.

Let $\rho \in \mathbb{R}^n$ with $\sum_{j=1}^{n} \rho_j^2 = 1$. Using the Cauchy-Schwarz inequality, we estimate

$$\int_0^1 \Big|\sum_{j=1}^{n} \rho_j\,k(t_j,s)\Big|^2 ds \le \int_0^1 \sum_{j=1}^{n} k(t_j,s)^2\,ds = \sum_{j=1}^{n} t_j = \frac{n+1}{2}.$$

Thus $a_n \le \sqrt{(n+1)/2}$.
It is straightforward to check that the inverse of $A$ is given by the tridiagonal matrix

$$A^{-1} = n \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{pmatrix}.$$

We estimate the largest eigenvalue $\mu_{\max}$ of $A^{-1}$ by the maximum absolute row sum $\mu_{\max} \le 4n$. This is asymptotically sharp because we can give a lower estimate of $\mu_{\max}$ by the trace formula

$$n\,\mu_{\max} \ge \operatorname{trace}(A^{-1}) = \sum_{j=1}^{n} (A^{-1})_{jj} = (2n-1)\,n;$$

that is, we have an estimate of $\lambda_n$ of the form

$$\frac{1}{4n} \le \lambda_n \le \frac{1}{2n-1}.$$

In Problem 3.6, it is shown that

$$\min\{\|x^* - z_n\|_{L^2} : z_n \in X_n\} \le \frac{1}{n}\,\|(x^*)'\|_{L^2}.$$

Thus we have proven the following theorem.
Theorem 3.24 The moment method for (3.80) converges. The following error estimate holds:

$$\|x^* - x_n^\delta\|_{L^2} \le \sqrt{\frac{n+1}{2}}\;4n\,\delta + \frac{c}{n}\,\|(x^*)'\|_{L^2}$$

if $x^* \in H^1(0,1)$. Here, $\delta$ is the discrete error on the right-hand side, that is, $\sum_{j=1}^{n} |\beta_j^\delta - y^*(t_j)|^2 \le \delta^2$ and $x_n^\delta = \sum_{j=1}^{n} \alpha_j^\delta \hat{x}_j$, where $\alpha^\delta \in \mathbb{R}^n$ solves $A\alpha^\delta = \beta^\delta$.


The choice $X_n = S_1(t_1,\ldots,t_n)$ of linear splines leads to the two-sided difference
quotient (see Problem 3.8). We refer to [85, 201, 202] for further reading on
moment collocation.


3.4.2 Collocation of Symm's Equation
We now study the numerical treatment of Symm's equation (3.57), that is,

$$(K\psi^*)(t) := -\frac{1}{\pi} \int_0^{2\pi} \psi^*(s)\,\ln|\gamma(t) - \gamma(s)|\,ds = g^*(t) \tag{3.82}$$

for $0 \le t \le 2\pi$ by collocation methods. The integral operator $K$ from (3.82) is well-defined and bounded from $L^2(0,2\pi)$ into $H^1_{\mathrm{per}}(0,2\pi)$. We assume throughout this subsection that $K$ is one-to-one (see Theorem 3.15). Then we have seen in Theorem 3.17 that equation (3.82) is uniquely solvable in $L^2(0,2\pi)$ for every $g^* \in H^1_{\mathrm{per}}(0,2\pi)$; that is, $K$ is an isomorphism. We define equidistant collocation points by

$$t_k := k\,\frac{\pi}{n} \quad\text{for } k = 0,\ldots,2n-1.$$


There are several choices for the space $X_n \subset L^2(0,2\pi)$ of basis functions. Before we study particular cases, let $X_n = \operatorname{span}\{\hat{x}_j : j \in J_n\} \subset L^2(0,2\pi)$ be arbitrary. $J_n \subset \mathbb{Z}$ denotes a set of indices with $2n$ elements. We assume that $\{\hat{x}_j : j \in J_n\}$ forms an orthonormal system in $L^2(0,2\pi)$.
The collocation equations (3.72) take the form

$$-\frac{1}{\pi} \int_0^{2\pi} \psi_n(s)\,\ln|\gamma(t_k) - \gamma(s)|\,ds = g^*(t_k), \quad k = 0,\ldots,2n-1, \tag{3.83}$$

with $\psi_n \in X_n$. Let $Q_n : H^1_{\mathrm{per}}(0,2\pi) \to Y_n$ be the trigonometric interpolation operator into the $2n$-dimensional space

$$Y_n := \Big\{ \sum_{m=-n}^{n-1} a_m e^{imt} : a_m \in \mathbb{C} \Big\}. \tag{3.84}$$

We recall some approximation properties of the interpolation operator $Q_n : H^1_{\mathrm{per}}(0,2\pi) \to Y_n$. First, it is easily checked (see also Theorem A.46 of Appendix A.4) that $Q_n$ is given by

$$Q_n\psi = \sum_{k=0}^{2n-1} \psi(t_k)\,\hat{L}_k$$

with Lagrange interpolation basis functions

$$\hat{L}_k(t) = \frac{1}{2n} \sum_{m=-n}^{n-1} e^{im(t - t_k)}, \quad k = 0,\ldots,2n-1. \tag{3.85}$$
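The interpolation property of the functions in (3.85) can be checked with a short computation. This is a minimal sketch, assuming the nodes $t_k = k\pi/n$ and the frequency range $m = -n,\ldots,n-1$ as above; the function names `lagrange_basis` and `trig_interpolant` are illustrative.

```python
import numpy as np

def lagrange_basis(k, t, n):
    """Evaluate (1/2n) * sum_{m=-n}^{n-1} exp(i m (t - t_k)) at the points t."""
    m = np.arange(-n, n)
    t_k = k * np.pi / n
    phase = np.outer(np.atleast_1d(t) - t_k, m)
    return np.exp(1j * phase).sum(axis=1) / (2 * n)

def trig_interpolant(values, t, n):
    """Evaluate Q_n psi at t from the nodal values psi(t_0), ..., psi(t_{2n-1})."""
    return sum(values[k] * lagrange_basis(k, t, n) for k in range(2 * n))

n = 4
nodes = np.arange(2 * n) * np.pi / n
L_at_nodes = np.array([lagrange_basis(k, nodes, n) for k in range(2 * n)])
print(np.allclose(L_at_nodes, np.eye(2 * n)))     # basis function k is 1 at t_k, 0 at other nodes

# A function that already lies in Y_n is reproduced exactly by Q_n.
f = lambda t: np.exp(2j * t) - 3.0 * np.exp(-1j * t)
t_test = np.linspace(0.0, 2.0 * np.pi, 17)
print(np.allclose(trig_interpolant(f(nodes), t_test, n), f(t_test)))
```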


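From Theorem A.46 of Appendix A.4, we have also the estimates

$$\|\psi - Q_n\psi\|_{L^2} \le \frac{c}{n}\,\|\psi\|_{H^1_{\mathrm{per}}} \quad\text{for all } \psi \in H^1_{\mathrm{per}}(0,2\pi), \tag{3.86a}$$

$$\|Q_n\psi\|_{H^1_{\mathrm{per}}} \le c\,\|\psi\|_{H^1_{\mathrm{per}}} \quad\text{for all } \psi \in H^1_{\mathrm{per}}(0,2\pi). \tag{3.86b}$$

Now we can reformulate the collocation equations (3.83) as

$$Q_n K\psi_n = Q_n g^* \quad\text{with } \psi_n \in X_n. \tag{3.87}$$

We use the perturbation result of Theorem 3.8 again and split $K$ into the form $K = K_0 + C$ with

$$(K_0\psi)(t) := -\frac{1}{2\pi} \int_0^{2\pi} \psi(s)\,\Big[\ln\Big(4\sin^2\frac{t-s}{2}\Big) - 1\Big]\,ds. \tag{3.88}$$

Now we specify the spaces $X_n$. As a first example, we choose the orthonormal functions

$$\hat{x}_j(t) = \frac{1}{\sqrt{2\pi}}\,e^{ijt} \quad\text{for } j = -n,\ldots,n-1. \tag{3.89}$$

We prove the following convergence result.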

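Theorem 3.25 Let $\hat{x}_j$, $j = -n,\ldots,n-1$, be given by (3.89). The collocation method is convergent; that is, the solution $\psi_n \in X_n$ of (3.83) converges to the solution $\psi^* \in L^2(0,2\pi)$ of (3.82) in $L^2(0,2\pi)$.

Let the right-hand side of (3.83) be replaced by $\beta^\delta \in \mathbb{C}^{2n}$ with

$$\sum_{k=0}^{2n-1} |\beta_k^\delta - g^*(t_k)|^2 \le \delta^2.$$

Let $\alpha^\delta \in \mathbb{C}^{2n}$ be the solution of $A\alpha^\delta = \beta^\delta$, where $A_{kj} = (K\hat{x}_j)(t_k)$. Then the following error estimate holds:

$$\|\psi_n^\delta - \psi^*\|_{L^2} \le c\,\big[\sqrt{n}\,\delta + \min\{\|\psi^* - \phi_n\|_{L^2} : \phi_n \in X_n\}\big], \tag{3.90}$$

where

$$\psi_n^\delta(t) = \frac{1}{\sqrt{2\pi}} \sum_{j=-n}^{n-1} \alpha_j^\delta\,e^{ijt}.$$

If $\psi^* \in H^r_{\mathrm{per}}(0,2\pi)$ for some $r > 0$, then

$$\|\psi_n^\delta - \psi^*\|_{L^2} \le c\,\Big[\sqrt{n}\,\delta + \frac{1}{n^r}\,\|\psi^*\|_{H^r_{\mathrm{per}}}\Big]. \tag{3.91}$$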
Proof: By the perturbation Theorem 3.8, it is sufficient to prove the result for $K_0$ instead of $K$. By (3.66a) and (3.66b), the operator $K_0$ maps $X_n$ into $Y_n = X_n$. Therefore, the collocation equation (3.87) for $K_0$ reduces to

$$K_0\psi_n = Q_n g^*.$$

We want to apply Theorem 3.7 and have to estimate $R_n K_0$ where in this case $R_n = (K_0|_{X_n})^{-1} Q_n$. Because $K_0 : L^2(0,2\pi) \to H^1_{\mathrm{per}}(0,2\pi)$ is invertible, we conclude that

$$\|R_n g\|_{L^2} = \|\psi_n\|_{L^2} \le c_1\,\|K_0\psi_n\|_{H^1_{\mathrm{per}}} = c_1\,\|Q_n g\|_{H^1_{\mathrm{per}}} \le c_2\,\|g\|_{H^1_{\mathrm{per}}}$$

for all $g \in H^1_{\mathrm{per}}(0,2\pi)$, and thus

$$\|R_n K_0\psi\|_{L^2} \le c_2\,\|K_0\psi\|_{H^1_{\mathrm{per}}} \le c_3\,\|\psi\|_{L^2}$$

for all $\psi \in L^2(0,2\pi)$. The application of Theorem 3.7 yields convergence.
To prove the error estimate (3.90), we want to apply Theorem 3.19 and hence have to estimate the singular values of the matrix $B$ defined by

$$B_{kj} = (K_0\hat{x}_j)(t_k), \quad k,j = -n,\ldots,n-1,$$

with $\hat{x}_j$ from (3.89). From (3.66a) and (3.66b), we observe that

$$B_{kj} = \frac{1}{\sqrt{2\pi}}\,\frac{1}{|j|}\,e^{ijk\pi/n}, \quad k,j = -n,\ldots,n-1,$$

where $1/|j|$ has to be replaced by 1 if $j = 0$. Because the singular values of $B$ are the square roots of the eigenvalues of $B^*B$, we compute

$$(B^*B)_{\ell j} = \sum_{k=-n}^{n-1} \overline{B_{k\ell}}\,B_{kj} = \frac{1}{2\pi}\,\frac{1}{|\ell|\,|j|} \sum_{k=-n}^{n-1} e^{ik(j-\ell)\pi/n} = \frac{n}{\pi}\,\frac{1}{\ell^2}\,\delta_{\ell j},$$

where again $1/\ell^2$ has to be replaced by 1 for $\ell = 0$. From this, we see that the singular values of $B$ are given by $\sqrt{n/(\pi\ell^2)}$ for $\ell = 1,\ldots,n$. The smallest singular value is $1/\sqrt{n\pi}$. Estimate (3.74) of Theorem 3.19 yields the assertion.
(3.91) follows from Theorem A.46.
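The singular values of $B$ can be confirmed numerically. The sketch below assembles $B$ from the eigenvalue relations (3.66a) and (3.66b) with the index range $k,j = -n,\ldots,n-1$ used in the proof; the function name `collocation_matrix_K0` is illustrative.

```python
import numpy as np

def collocation_matrix_K0(n):
    """B_kj = (1/sqrt(2 pi)) * lambda_j * exp(i j k pi / n), lambda_j = 1/|j| (1 for j = 0)."""
    idx = np.arange(-n, n)
    lam = 1.0 / np.maximum(np.abs(idx), 1)        # eigenvalues of K_0 on exp(i j t)
    k = idx[:, None]
    j = idx[None, :]
    return lam[None, :] * np.exp(1j * j * k * np.pi / n) / np.sqrt(2.0 * np.pi)

for n in (2, 5, 16):
    B = collocation_matrix_K0(n)
    sing = np.linalg.svd(B, compute_uv=False)
    print(n, sing.min(), 1.0 / np.sqrt(n * np.pi))
```

The smallest computed singular value matches $1/\sqrt{n\pi}$, attained at the index $j = -n$.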


Comparing the estimate (3.91) with the corresponding error estimate (3.69)
for the Galerkin methods, it seems that the estimate for the collocation method
is better because the error $\delta$ is only multiplied by $\sqrt{n}$ instead of $n$. Let us now
compare the errors of the continuous perturbation $\|g^* - g^\delta\|_{L^2}$ with the discrete
perturbation for both methods. To do this, we have to "extend" the discrete
vector $\beta^\delta$ to a function $g_n^\delta \in X_n$. For the collocation method, we have to use
the interpolation operator $Q_n$ and define $g_n^\delta \in X_n$ by $g_n^\delta = \sum_{j=0}^{2n-1} \beta_j^\delta\, \hat y_j$, where
$\hat y_j$ are the Lagrange basis functions (3.85). Then $g_n^\delta(t_k) = \beta_k^\delta$, and we estimate

$$\|g_n^\delta - g^*\|_{L^2} \le \|g_n^\delta - Q_n g^*\|_{L^2} + \|Q_n g^* - g^*\|_{L^2} .$$

Writing

$$g_n^\delta(t) - Q_n g^*(t) = \sum_{j=-n}^{n-1} \rho_j\, e^{ijt},$$

a simple computation shows that

$$\sum_{k=0}^{2n-1} |\beta_k^\delta - g^*(t_k)|^2
= \sum_{k=0}^{2n-1} |g_n^\delta(t_k) - Q_n g^*(t_k)|^2
= \sum_{k=0}^{2n-1} \Bigl|\sum_{j=-n}^{n-1} \rho_j\, e^{ijk\pi/n}\Bigr|^2
= 2n \sum_{j=-n}^{n-1} |\rho_j|^2 = \frac{n}{\pi}\,\|g_n^\delta - Q_n g^*\|_{L^2}^2 . \tag{3.92}$$

Therefore, for the collocation method we have to compare the continuous error
$\delta$ with the discrete error $\delta\sqrt{n/\pi}$. This gives an extra factor of $\sqrt{n}$ in the first
terms of (3.90) and (3.91).
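
The identity behind (3.92) is the discrete orthogonality of the exponentials $e^{ijt}$ on the grid $t_k = k\pi/n$. The short Python check below is only a sketch: NumPy is assumed, and the random coefficients `rho` and the size n = 6 are hypothetical test data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                              # illustrative size
j = np.arange(-n, n)                               # frequencies -n, ..., n-1
rho = rng.normal(size=2 * n) + 1j * rng.normal(size=2 * n)   # test coefficients

t = np.arange(2 * n) * np.pi / n                   # collocation points t_k = k pi / n
values = np.exp(1j * np.outer(t, j)) @ rho         # p(t_k) = sum_j rho_j exp(i j t_k)

discrete = np.sum(np.abs(values) ** 2)             # sum_k |p(t_k)|^2
l2_norm_sq = 2 * np.pi * np.sum(np.abs(rho) ** 2)  # ||p||^2 in L^2(0, 2 pi) by Parseval

print(discrete, 2 * n * np.sum(np.abs(rho) ** 2), n / np.pi * l2_norm_sq)
```

All three printed numbers agree, so a grid error of size δ corresponds to an $L^2$ error of size $\delta\sqrt{\pi/n}$.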

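For Galerkin methods, however, we define $g_n^\delta(t) = \frac{1}{2\pi} \sum_{j=-n}^{n} \beta_j^\delta \exp(ijt)$.
Then $(g_n^\delta, e^{ij\cdot})_{L^2} = \beta_j^\delta$. Let $P_n$ be the orthogonal projection onto $X_n$. In

$$\|g_n^\delta - g^*\|_{L^2} \le \|g_n^\delta - P_n g^*\|_{L^2} + \|P_n g^* - g^*\|_{L^2},$$

we estimate the first term as

$$\|g_n^\delta - P_n g^*\|_{L^2}^2 = \frac{1}{2\pi} \sum_{j=-n}^{n} \bigl|\beta_j^\delta - (g^*, e^{ij\cdot})_{L^2}\bigr|^2 .$$

In this case, the continuous and discrete errors are of the same order.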


Choosing trigonometric polynomials as basis functions is particularly suitable
for smooth boundary data. If $\partial\Omega$ or the right-hand side $f$ of the boundary
value problem (3.54) is not smooth, then spaces of piecewise constant functions
are more appropriate. We now study the case where the basis functions
$\hat x_j \in L^2(0,2\pi)$ are defined by

$$\hat x_0(t) = \begin{cases} \sqrt{n/\pi}, & \text{if } t < \frac{\pi}{2n} \text{ or } t > 2\pi - \frac{\pi}{2n}, \\ 0, & \text{if } \frac{\pi}{2n} < t < 2\pi - \frac{\pi}{2n}, \end{cases} \tag{3.93a}$$

$$\hat x_j(t) = \begin{cases} \sqrt{n/\pi}, & \text{if } |t - t_j| < \frac{\pi}{2n}, \\ 0, & \text{if } |t - t_j| > \frac{\pi}{2n}, \end{cases} \tag{3.93b}$$

for $j = 1,\ldots,2n-1$. Then $\hat x_j$, $j = 0,\ldots,2n-1$, are also orthonormal in
$L^2(0,2\pi)$. In the following lemma, we collect some approximation properties of
the corresponding spaces $X_n$.

Lemma 3.26 Let $X_n = \mathrm{span}\{\hat x_j : j = 0,\ldots,2n-1\}$, where $\hat x_j$ are defined
by (3.93a) and (3.93b). Let $P_n : L^2(0,2\pi) \to X_n$ be the orthogonal projection
operator. Then $\bigcup_{n\in\mathbb{N}} X_n$ is dense in $L^2(0,2\pi)$ and there exists $c > 0$ with

$$\|\psi - P_n\psi\|_{L^2} \le \frac{c}{n}\,\|\psi\|_{H^1_{per}} \quad \text{for all } \psi \in H^1_{per}(0,2\pi), \tag{3.94a}$$

$$\|K(\psi - P_n\psi)\|_{L^2} \le \frac{c}{n}\,\|\psi\|_{L^2} \quad \text{for all } \psi \in L^2(0,2\pi). \tag{3.94b}$$


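Proof: Estimate (3.94a) is left as an exercise. To prove estimate (3.94b), we
use (implicitly) a duality argument:

$$\begin{aligned}
\|K(\psi - P_n\psi)\|_{L^2}
&= \sup_{\phi \ne 0} \frac{(K(\psi - P_n\psi), \phi)_{L^2}}{\|\phi\|_{L^2}}
 = \sup_{\phi \ne 0} \frac{(\psi - P_n\psi, K\phi)_{L^2}}{\|\phi\|_{L^2}} \\
&= \sup_{\phi \ne 0} \frac{(\psi, (I - P_n)K\phi)_{L^2}}{\|\phi\|_{L^2}}
 \le \|\psi\|_{L^2} \sup_{\phi \ne 0} \frac{\|(I - P_n)K\phi\|_{L^2}}{\|\phi\|_{L^2}} \\
&\le \frac{\tilde c}{n}\,\|\psi\|_{L^2} \sup_{\phi \ne 0} \frac{\|K\phi\|_{H^1_{per}}}{\|\phi\|_{L^2}}
 \le \frac{c}{n}\,\|\psi\|_{L^2} .
\end{aligned}$$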


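Before we prove a convergence theorem, we compute the singular values of the
matrix $B$ defined by

$$B_{kj} = (K_0\hat x_j)(t_k) = -\frac{1}{2\pi}\sqrt{\frac{n}{\pi}} \int_{t_j - \pi/(2n)}^{t_j + \pi/(2n)} \Bigl[\ln\Bigl(4\sin^2\frac{s - t_k}{2}\Bigr) - 1\Bigr]\, ds . \tag{3.95}$$

Lemma 3.27 $B$ is symmetric and positive definite. The singular values of $B$
coincide with the eigenvalues and are given by

$$\mu_0 = \sqrt{\frac{n}{\pi}} \quad\text{and}\quad \mu_m = \sqrt{\frac{n}{\pi}}\;\frac{\sin\frac{m\pi}{2n}}{2n\pi} \sum_{j\in\mathbb{Z}} \frac{1}{\bigl(\frac{m}{2n} + j\bigr)^2} \tag{3.96a}$$

for $m = 1,\ldots,2n-1$. Furthermore, there exists $c > 0$ with

$$\frac{1}{\sqrt{\pi n}} \le \mu_m \le c\sqrt{n} \quad \text{for all } m = 0,\ldots,2n-1. \tag{3.96b}$$

We observe that the condition number of $B$, that is, the ratio between the
largest and smallest singular values, is again bounded by $n$.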
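
Proof: We write

$$B_{kj} = -\frac{1}{2\pi}\sqrt{\frac{n}{\pi}} \int_{t_j - \pi/(2n)}^{t_j + \pi/(2n)} \Bigl[\ln\Bigl(4\sin^2\frac{s - t_k}{2}\Bigr) - 1\Bigr]\, ds = b_{j-k}$$

with

$$b_\ell = -\frac{1}{2\pi}\sqrt{\frac{n}{\pi}} \int_{t_\ell - \pi/(2n)}^{t_\ell + \pi/(2n)} \Bigl[\ln\Bigl(4\sin^2\frac{s}{2}\Bigr) - 1\Bigr]\, ds,$$

where we extended the definition of $t_\ell$ to all $\ell \in \mathbb{Z}$. Therefore, $B$ is circulant
and symmetric. The eigenvectors $x^{(m)}$ and eigenvalues $\mu_m$ of $B$ are given by

$$x^{(m)} = \bigl(e^{imk\pi/n}\bigr)_{k=0}^{2n-1} \quad\text{and}\quad \mu_m = \sum_{k=0}^{2n-1} b_k\, e^{imk\pi/n},$$

respectively, for $m = 0,\ldots,2n-1$, as is easily checked. We write $\mu_m$ in the
form

$$\mu_m = -\frac{1}{2\pi} \int_0^{2\pi} \psi_m(s) \Bigl[\ln\Bigl(4\sin^2\frac{s}{2}\Bigr) - 1\Bigr]\, ds = K_0\psi_m(0)$$

with

$$\psi_m(s) = \sqrt{\frac{n}{\pi}}\; e^{imk\pi/n} \quad\text{for } |s - t_k| \le \frac{\pi}{2n},\ k \in \mathbb{Z}.$$

Let $\psi_m(t) = \sum_{k\in\mathbb{Z}} \rho_{m,k} \exp(ikt)$. Then by (3.66a) and (3.66b), we have

$$\mu_m = \rho_{m,0} + \sum_{k \ne 0} \frac{\rho_{m,k}}{|k|} .$$

Therefore, we have to compute the Fourier coefficients $\rho_{m,k}$ of $\psi_m$. They are
given by

$$\rho_{m,k} = \frac{1}{2\pi} \int_0^{2\pi} \psi_m(s)\, e^{-iks}\, ds
= \sqrt{\frac{n}{\pi}}\;\frac{1}{2\pi} \sum_{j=0}^{2n-1} e^{imj\pi/n} \int_{t_j - \pi/(2n)}^{t_j + \pi/(2n)} e^{-iks}\, ds .$$

For $k = 0$, this reduces to

$$\rho_{m,0} = \sqrt{\frac{n}{\pi}}\;\frac{1}{2n} \sum_{j=0}^{2n-1} e^{imj\pi/n}
= \begin{cases} \sqrt{n/\pi}, & \text{if } m = 0, \\ 0, & \text{if } m = 1,\ldots,2n-1, \end{cases}$$

and for $k \ne 0$ to

$$\rho_{m,k} = \sqrt{\frac{n}{\pi}}\;\frac{\sin\frac{\pi k}{2n}}{\pi k} \sum_{j=0}^{2n-1} e^{i(m-k)j\pi/n}
= \begin{cases} \sqrt{\dfrac{n}{\pi}}\;\dfrac{2n}{\pi k}\,\sin\dfrac{\pi k}{2n}, & \text{if } m - k \in 2n\mathbb{Z}, \\[2mm] 0, & \text{if } m - k \notin 2n\mathbb{Z}. \end{cases}$$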
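
Thus we have $\mu_0 = \sqrt{n/\pi}$ and

$$\mu_m = \sqrt{\frac{n}{\pi}}\;\frac{2n}{\pi} \sum_{k - m \in 2n\mathbb{Z}} \frac{\bigl|\sin\frac{\pi k}{2n}\bigr|}{k^2}
= \sqrt{\frac{n}{\pi}}\;\frac{\sin\frac{\pi m}{2n}}{2n\pi} \sum_{j\in\mathbb{Z}} \frac{1}{\bigl(\frac{m}{2n} + j\bigr)^2}$$

for $m = 1,\ldots,2n-1$. This proves (3.96a). Because all eigenvalues are positive,
the matrix $B$ is positive definite and the eigenvalues coincide with the singular
values. We set $x = m/(2n) \in (0,1)$ and separate the first two terms in the
series. This yields

$$\begin{aligned}
\mu_m &= \sqrt{\frac{n}{\pi}}\;\frac{\sin \pi x}{2n\pi} \Bigl(\frac{1}{x^2} + \frac{1}{(x-1)^2}\Bigr)
 + \sqrt{\frac{n}{\pi}}\;\frac{\sin \pi x}{2n\pi} \sum_{j=1}^{\infty} \frac{1}{(x+j)^2}
 + \sqrt{\frac{n}{\pi}}\;\frac{\sin \pi x}{2n\pi} \sum_{j=2}^{\infty} \frac{1}{(x-j)^2} \\
&\ge \sqrt{\frac{n}{\pi}}\;\frac{1}{2n\pi} \Bigl(\frac{\sin \pi x}{x^2} + \frac{\sin \pi(1-x)}{(1-x)^2}\Bigr).
\end{aligned} \tag{3.97}$$

From the elementary estimate

$$\frac{\sin \pi x}{x^2} + \frac{\sin \pi(1-x)}{(1-x)^2} \ge 8, \quad x \in (0,1),$$

we conclude that

$$\mu_m \ge \frac{4}{\pi\sqrt{\pi n}} \ge \frac{1}{\sqrt{\pi n}}$$

for $m = 1,\ldots,2n-1$. The upper estimate of (3.96b) is proven analogously.
Indeed, from (3.97) we estimate, using also $|\sin t/t| \le 1$,

$$\mu_m \le \sqrt{\frac{n}{\pi}}\;\frac{1}{2n\pi} \Bigl(\frac{\sin \pi x}{x^2} + \frac{\sin \pi(1-x)}{(1-x)^2} + 2\sum_{j=1}^{\infty} \frac{1}{j^2}\Bigr)
\le \sqrt{\frac{n}{\pi}}\;\frac{1}{2n} \Bigl(\frac{1}{x} + \frac{1}{1-x} + \frac{2}{\pi}\sum_{j=1}^{\infty} \frac{1}{j^2}\Bigr) \le c\sqrt{n}$$

for some $c > 0$.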


Now we can prove the following convergence result. 


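Theorem 3.28 Let $\hat x_j$, $j = 0,\ldots,2n-1$, be defined by (3.93a) and (3.93b).
The collocation method is convergent; that is, the solution $\psi_n \in X_n$ of (3.83)
converges to the solution $\psi^* \in L^2(0,2\pi)$ of (3.82) in $L^2(0,2\pi)$.

Let the right-hand side be replaced by $\beta^\delta \in \mathbb{C}^{2n}$ with

$$\sum_{k=0}^{2n-1} \bigl|\beta_k^\delta - g^*(t_k)\bigr|^2 \le \delta^2 .$$

Let $\alpha^\delta \in \mathbb{C}^{2n}$ be the solution of $A\alpha^\delta = \beta^\delta$, where $A_{kj} = K\hat x_j(t_k)$. Then the
following error estimate holds:

$$\|\psi_n^\delta - \psi^*\|_{L^2} \le c\bigl[\sqrt{n}\,\delta + \min\{\|\psi^* - \phi_n\|_{L^2} : \phi_n \in X_n\}\bigr], \tag{3.98}$$

where $\psi_n^\delta = \sum_{j=0}^{2n-1} \alpha_j^\delta\, \hat x_j$. If $\psi^* \in H^1_{per}(0,2\pi)$, then

$$\|\psi_n^\delta - \psi^*\|_{L^2} \le c\Bigl[\sqrt{n}\,\delta + \frac{1}{n}\,\|\psi^*\|_{H^1_{per}}\Bigr]. \tag{3.99}$$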

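Proof: By the perturbation theorem (Theorem 3.8), it is sufficient to prove
the result for $K_0$ instead of $K$. Again set

$$R_n = \bigl[Q_n K_0|_{X_n}\bigr]^{-1} Q_n : H^1_{per}(0,2\pi) \to X_n \subset L^2(0,2\pi),$$

let $\psi \in H^1_{per}(0,2\pi)$, and set $\psi_n = R_n\psi = \sum_{j=0}^{2n-1} \alpha_j \hat x_j$. Then $\alpha \in \mathbb{C}^{2n}$ solves
$B\alpha = \beta$ with $\beta_k = \psi(t_k)$, and thus by (3.96b)

$$\|\psi_n\|_{L^2} = |\alpha| \le |B^{-1}|\,|\beta| \le \sqrt{\pi n}\,\Bigl[\sum_{k=0}^{2n-1} |\psi(t_k)|^2\Bigr]^{1/2},$$

where $|\cdot|$ again denotes the Euclidean norm in $\mathbb{C}^{2n}$. Using this estimate and
(3.92) for $\beta^\delta = 0$, we conclude that

$$\|R_n\psi\|_{L^2} = \|\psi_n\|_{L^2} \le n\,\|Q_n\psi\|_{L^2} \tag{3.100}$$

for all $\psi \in H^1_{per}(0,2\pi)$. Thus

$$\|R_n K_0\psi\|_{L^2} \le n\,\|Q_n K_0\psi\|_{L^2}$$

for all $\psi \in L^2(0,2\pi)$. Now we estimate $\|R_n K_0\psi\|_{L^2}$ by the $L^2$-norm of $\psi$ itself.
Let $\tilde\psi_n = P_n\psi \in X_n$ be the orthogonal projection of $\psi \in L^2(0,2\pi)$ in $X_n$.
Then $R_n K_0\tilde\psi_n = \tilde\psi_n$ and $\|\tilde\psi_n\|_{L^2} \le \|\psi\|_{L^2}$, and thus

$$\begin{aligned}
\|R_n K_0\psi - \tilde\psi_n\|_{L^2} &= \|R_n K_0(\psi - \tilde\psi_n)\|_{L^2} \le n\,\|Q_n K_0(\psi - \tilde\psi_n)\|_{L^2} \\
&\le n\,\|Q_n K_0\psi - K_0\psi\|_{L^2} + n\,\|K_0\psi - K_0\tilde\psi_n\|_{L^2} + n\,\|K_0\tilde\psi_n - Q_n K_0\tilde\psi_n\|_{L^2} .
\end{aligned}$$

Now we use the error estimates (3.86a), (3.94a), and (3.94b) of Lemma 3.26.
This yields

$$\|R_n K_0\psi - \tilde\psi_n\|_{L^2} \le c_1\bigl[\|K_0\psi\|_{H^1_{per}} + \|\psi\|_{L^2} + \|K_0\tilde\psi_n\|_{H^1_{per}}\bigr]
\le c_2\bigl[\|\psi\|_{L^2} + \|\tilde\psi_n\|_{L^2}\bigr] \le c_3\|\psi\|_{L^2},$$

that is, $\|R_n K_0\psi\|_{L^2} \le c_4\|\psi\|_{L^2}$ for all $\psi \in L^2(0,2\pi)$. Therefore, the assumptions
of Theorem 3.7 are satisfied. The application of Theorem 3.19 yields the
error estimate (3.99).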

Among the extensive literature on collocation methods for Symm's integral
equation and related equations, we mention only the work of [8, 66, 67, 140, 147,
241, 242]. Symm's equation has also been numerically treated by quadrature
methods; see [90, 169, 239, 240, 250, 251]. For more general problems, we refer
to [9, 68].


3.5 Numerical Experiments for Symm's Equation


In this section, we apply all of the previously investigated regularization
strategies to Symm's integral equation

$$(K\psi)(t) = -\frac{1}{\pi} \int_0^{2\pi} \psi(s)\, \ln|\gamma(t) - \gamma(s)|\, ds = g(t), \quad 0 \le t \le 2\pi,$$

where in this example $\gamma(s) = (\cos s, 2\sin s)$, $0 \le s \le 2\pi$, denotes the parametrization
of the ellipse with semiaxes 1 and 2. First, we discuss the numerical computation
of $K\psi$. We write $K\psi$ in the form (see (3.60))

$$(K\psi)(t) = -\frac{1}{2\pi} \int_0^{2\pi} \psi(s)\, \ln\Bigl(4\sin^2\frac{t-s}{2}\Bigr)\, ds + \int_0^{2\pi} \psi(s)\, k(t,s)\, ds,$$

for $0 \le t \le 2\pi$, with the analytic function

$$k(t,s) = -\frac{1}{2\pi} \ln\frac{|\gamma(t) - \gamma(s)|^2}{4\sin^2\frac{t-s}{2}}, \quad t \ne s.$$


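We use the trapezoidal rule for periodic functions (see [168]). Let $t_j = j\pi/n$,
$j = 0,\ldots,2n-1$. The smooth part is approximated by

$$\int_0^{2\pi} k(t,s)\, \psi(s)\, ds \approx \frac{\pi}{n} \sum_{j=0}^{2n-1} k(t,t_j)\, \psi(t_j), \quad 0 \le t \le 2\pi.$$

For the weakly singular part, we replace $\psi$ by its trigonometric interpolation
polynomial $Q_n\psi = \sum_{j=0}^{2n-1} \psi(t_j)\, L_j$ into the $2n$-dimensional space

$$\Bigl\{ \sum_{j=0}^{n} a_j \cos(jt) + \sum_{j=1}^{n-1} b_j \sin(jt) : a_j, b_j \in \mathbb{R} \Bigr\}$$

over $\mathbb{R}$ (see Section A.4 of Appendix A). From (A.37) and Lemma 3.16, we
conclude that

$$-\frac{1}{2\pi} \int_0^{2\pi} \psi(s)\, \ln\Bigl(4\sin^2\frac{t-s}{2}\Bigr)\, ds
\approx -\frac{1}{2\pi} \int_0^{2\pi} (Q_n\psi)(s)\, \ln\Bigl(4\sin^2\frac{t-s}{2}\Bigr)\, ds
= \sum_{j=0}^{2n-1} \psi(t_j)\, R_j(t), \quad 0 \le t \le 2\pi,$$

where

$$R_j(t) = -\frac{1}{2\pi} \int_0^{2\pi} L_j(s)\, \ln\Bigl(4\sin^2\frac{t-s}{2}\Bigr)\, ds
= \frac{1}{n} \Bigl\{ \frac{1}{2n} \cos n(t - t_j) + \sum_{m=1}^{n-1} \frac{1}{m} \cos m(t - t_j) \Bigr\}$$

for $j = 0,\ldots,2n-1$. Therefore, the operator $K$ is replaced by

$$(K_n\psi)(t) = \sum_{j=0}^{2n-1} \psi(t_j) \Bigl[ R_j(t) + \frac{\pi}{n}\, k(t,t_j) \Bigr], \quad 0 \le t \le 2\pi.$$

It is well-known (see [168]) that $K_n\psi$ converges uniformly to $K\psi$ for every
$2\pi$-periodic continuous function $\psi$. Furthermore, the error $\|K_n\psi - K\psi\|_\infty$ is
exponentially decreasing for analytic functions $\psi$. For $t = t_k$, $k = 0,\ldots,2n-1$,
we have $(K_n\psi)(t_k) = \sum_{j=0}^{2n-1} A_{kj}\, \psi(t_j)$ with the symmetric matrix

$$A_{kj} = R_{|k-j|} + \frac{\pi}{n}\, k(t_k,t_j), \quad k,j = 0,\ldots,2n-1,$$

$$\text{where}\quad R_\ell = \frac{1}{n} \Bigl\{ \frac{(-1)^\ell}{2n} + \sum_{m=1}^{n-1} \frac{1}{m} \cos\frac{m\ell\pi}{n} \Bigr\}, \quad \ell = 0,\ldots,2n-1.$$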


For the numerical example, we take $\psi(s) = \exp(3\sin s)$, $0 \le s \le 2\pi$, and
$g = K\psi$ or, discretized, $\tilde\psi_j = \exp(3\sin t_j)$, $j = 0,\ldots,2n-1$, and $\tilde g = A\tilde\psi$. We
take $n = 60$ and add uniformly distributed random noise on the data $\tilde g$. All the
results show the average of 10 computations. The errors are measured in the
discrete norm $|z|_2^2 := \frac{1}{2n} \sum_{j=0}^{2n-1} |z_j|^2$, $z \in \mathbb{C}^{2n}$.

First, we consider Tikhonov's regularization method for $\delta = 0.1$, $\delta = 0.01$,
$\delta = 0.001$, and $\delta = 0$. In Figure 3.1, we plot the errors $|\tilde\psi^{\alpha,\delta} - \tilde\psi|_2$ and
$|A\tilde\psi^{\alpha,\delta} - \tilde g|_2$ in the solution and the right-hand side, respectively, versus the
regularization parameter $\alpha$.

We clearly observe the expected behavior of the errors: For $\delta > 0$ the error
in the solution has a well-defined minimum that depends on $\delta$, while the defect
always converges to zero as $\alpha$ tends to zero.
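
The discretization and the Tikhonov experiment described above can be sketched in a few lines of Python. This is not the author's code but a minimal illustration with NumPy assumed: the function names are hypothetical, the diagonal value $k(t,t) = -\frac{1}{\pi}\ln|\gamma'(t)|$ is the continuous limit of the formula for $k(t,s)$, the noise model (uniform on $[-\delta,\delta]$) is one possible reading of "uniformly distributed random noise", and the Tikhonov solution is computed from the normal equations $(\alpha I + A^\top A)\psi = A^\top g$.

```python
import numpy as np

def symm_matrix(gamma, dgamma, n):
    """Matrix A_kj = R_|k-j| + (pi/n) k(t_k, t_j) for a curve gamma with derivative dgamma."""
    t = np.arange(2 * n) * np.pi / n
    ell = np.arange(2 * n)
    m = np.arange(1, n)
    # R_ell = (1/n) * [ (-1)^ell / (2n) + sum_{m=1}^{n-1} cos(m ell pi / n) / m ]
    R = ((-1.0) ** ell / (2 * n)
         + (np.cos(np.outer(ell, m) * np.pi / n) / m).sum(axis=1)) / n
    A = R[np.abs(ell[:, None] - ell[None, :])]
    pts = gamma(t)                                              # shape (2n, 2)
    dist2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    sin2 = 4 * np.sin((t[:, None] - t[None, :]) / 2) ** 2
    off = ~np.eye(2 * n, dtype=bool)
    kern = np.zeros((2 * n, 2 * n))
    kern[off] = -np.log(dist2[off] / sin2[off]) / (2 * np.pi)   # k(t, s) for t != s
    # Assumption: diagonal filled with the limit k(t, t) = -(1/pi) ln |gamma'(t)|
    kern[np.diag_indices(2 * n)] = -np.log(np.linalg.norm(dgamma(t), axis=1)) / np.pi
    return t, A + (np.pi / n) * kern

def tikhonov(A, g, alpha):
    """Tikhonov solution from the normal equations (alpha I + A^T A) psi = A^T g."""
    return np.linalg.solve(alpha * np.eye(A.shape[1]) + A.T @ A, A.T @ g)

def discrete_norm(z):
    """|z|_2 with |z|_2^2 = (1/(2n)) sum_j |z_j|^2."""
    return np.sqrt(np.mean(np.abs(z) ** 2))

ellipse = lambda s: np.stack([np.cos(s), 2 * np.sin(s)], axis=1)
d_ellipse = lambda s: np.stack([-np.sin(s), 2 * np.cos(s)], axis=1)

t, A = symm_matrix(ellipse, d_ellipse, 60)
psi = np.exp(3 * np.sin(t))
g = A @ psi

rng = np.random.default_rng(1)
delta = 0.01
g_delta = g + delta * rng.uniform(-1.0, 1.0, size=g.shape)      # assumed noise model
for alpha in (1e-1, 1e-2, 1e-3, 1e-4):
    err = discrete_norm(tikhonov(A, g_delta, alpha) - psi)
    print(f"alpha = {alpha:7.0e}   error = {err:.4f}")

err_exact = discrete_norm(tikhonov(A, g, 1e-12) - psi)           # delta = 0, tiny alpha
```

The printed numbers depend on the noise realization and are not expected to reproduce the averaged values reported in the text.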

The minimal values $\mathrm{err}_\delta$ of the errors in the solution are approximately
0.351, 0.0909, and 0.0206 for $\delta = 0.1$, 0.01, and 0.001, respectively. From
this, we observe the order of convergence: increasing the data error $\delta$ by a
factor of 10 should increase the error in the solution by a factor of
$10^{2/3} \approx 4.64$, which roughly agrees with the numerical results where
$\mathrm{err}_{\delta=0.1}/\mathrm{err}_{\delta=0.01} \approx 3.86$ and $\mathrm{err}_{\delta=0.01}/\mathrm{err}_{\delta=0.001} \approx 4.41$.
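
The quoted ratios follow directly from the reported minimal errors. As a small arithmetic check in Python (the variable names are illustrative):

```python
# Minimal errors reported in the text for Tikhonov's method
errors = {0.1: 0.351, 0.01: 0.0909, 0.001: 0.0206}

predicted = 10 ** (2 / 3)                   # factor expected for the order delta^(2/3)
ratio_1 = errors[0.1] / errors[0.01]        # err_{0.1} / err_{0.01}
ratio_2 = errors[0.01] / errors[0.001]      # err_{0.01} / err_{0.001}

print(f"predicted {predicted:.2f}, observed {ratio_1:.2f} and {ratio_2:.2f}")
```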

In Figure 3.2, we show the results for the Landweber iteration with $a = 0.5$
for the same example where again $\delta = 0.1$, $\delta = 0.01$, $\delta = 0.001$, and $\delta = 0$.
The errors in the solution and the defects are now plotted versus the iteration
number $m$.
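
For orientation, the Landweber iteration for a matrix equation $Ax = y$ can be sketched as follows. The sketch assumes the standard form $x^{m+1} = x^m + a\,A^*(y - Ax^m)$ with $x^0 = 0$ and a step size $a$ satisfying $a\|A\|^2 < 2$; the function name and the small diagonal test matrix are hypothetical and unrelated to the Symm matrix of this section.

```python
import numpy as np

def landweber(A, y, a, iterations):
    """Landweber iteration x_{m+1} = x_m + a * A^H (y - A x_m), started at x_0 = 0."""
    x = np.zeros(A.shape[1], dtype=np.result_type(A, y))
    for _ in range(iterations):
        x = x + a * (A.conj().T @ (y - A @ x))
    return x

# Hypothetical test problem with ||A|| = 1, so that a = 0.5 is an admissible step size
A_test = np.diag([1.0, 0.5])
x_true = np.array([2.0, -1.0])
x_200 = landweber(A_test, A_test @ x_true, a=0.5, iterations=200)
print(x_200)
```

For noisy data the iteration number plays the role of the regularization parameter, so the loop would be stopped early instead of being run to convergence.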


Figure 3.1: Error for Tikhonov's regularization method.

Figure 3.3: Error for the conjugate gradient method.

In Figure 3.3, we show the results for the conjugate gradient method for the
same example where again $\delta = 0.1$, $\delta = 0.01$, $\delta = 0.001$, and $\delta = 0$. The errors
in the solution and the defects are again plotted versus the iteration number $m$.

Here, we observe the same behavior as for Tikhonov's method. We note the
difference in the results for the Landweber method and the conjugate gradient
method. The latter decreases the errors very quickly but is very sensitive to
the exact stopping rule, while the Landweber iteration is slow but very stable
with respect to the stopping parameter $\tau$. The minimal values are
$\mathrm{err}_{\delta=0.1} \approx 0.177$, $\mathrm{err}_{\delta=0.01} \approx 0.0352$, and $\mathrm{err}_{\delta=0.001} \approx 0.0054$ for the Landweber iteration
and $\mathrm{err}_{\delta=0.1} \approx 0.172$, $\mathrm{err}_{\delta=0.01} \approx 0.0266$, and $\mathrm{err}_{\delta=0.001} \approx 0.0038$ for the
conjugate gradient method. The corresponding factors are considerably larger
than $10^{2/3} \approx 4.64$, indicating the optimality of these methods also for smooth
solutions (see the remarks following Theorem 2.15).


Table 3.1: Least squares method

  n    δ = 0.1    δ = 0.01    δ = 0.001      δ = 0
  1    38.190     38.190      38.190         38.190
  2    15.772     15.769      15.768         15.768
  3    5.2791     5.2514      5.2511         5.2511
  4    1.6209     1.4562      1.4541         1.4541
  5    1.0365     0.3551      3.433*10^-1    3.432*10^-1
  6    1.1954     0.1571      7.190*10^-2    7.045*10^-2
 10    2.7944     0.2358      2.742*10^-2    4.075*10^-5
 12    3.7602     0.3561      3.187*10^-2    5.713*10^-7
 15    4.9815     0.4871      4.977*10^-2    5.570*10^-10
 20    7.4111     0.7270      7.300*10^-2    3.530*10^-12


Table 3.2: Bubnov–Galerkin method

  n    δ = 0.1    δ = 0.01    δ = 0.001      δ = 0
  1    38.190     38.190      38.190         38.190
  2    15.771     15.769      15.768         15.768
  3    5.2752     5.2514      5.2511         5.2511
  4    1.6868     1.4565      1.4541         1.4541
  5    1.1467     0.3580      3.434*10^-1    3.432*10^-1
  6    1.2516     0.1493      7.168*10^-2    7.045*10^-2
 10    2.6849     0.2481      2.881*10^-2    4.075*10^-5
 12    3.3431     0.3642      3.652*10^-2    5.713*10^-7
 15    4.9549     0.4333      5.719*10^-2    5.570*10^-10
 20    7.8845     0.7512      7.452*10^-2    3.519*10^-12


Next, we compute the same example using some projection methods. First,
we list the results for the least squares method and the Bubnov–Galerkin method
of Subsections 3.2.1 and 3.2.3 in Tables 3.1 and 3.2. We observe that both
methods produce almost the same results, which reflect the estimates of
Theorem 3.18. Note that for $\delta = 0$ the error decreases exponentially with $n$. This
reflects the fact that the best approximation $\min\{\|\psi - \phi_n\|_{L^2} : \phi_n \in X_n\}$
converges to zero exponentially due to the analyticity of the solution $\psi(s) =
\exp(3\sin s)$ (see [168], Theorem 11.7).


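Now we turn to the collocation methods of Section 3.4. To implement the
collocation method (3.83) for Symm's integral equation and the basis functions
(3.89), (3.93a), and (3.93b), we have to compute the integrals

$$-\frac{1}{\pi} \int_0^{2\pi} e^{ijs}\, \ln|\gamma(t_k) - \gamma(s)|\, ds, \tag{3.101a}$$

$j = -m,\ldots,m-1$, $k = 0,\ldots,2m-1$, and

$$-\frac{1}{\pi} \int_0^{2\pi} \hat x_j(s)\, \ln|\gamma(t_k) - \gamma(s)|\, ds, \quad j,k = 0,\ldots,2m-1, \tag{3.101b}$$

respectively. For the first integral (3.101a), we write using (3.64),

$$\begin{aligned}
-\frac{1}{\pi} \int_0^{2\pi} e^{ijs}\, \ln|\gamma(t_k) - \gamma(s)|\, ds
&= -\frac{1}{2\pi} \int_0^{2\pi} e^{ijs}\, \ln\Bigl(4\sin^2\frac{t_k - s}{2}\Bigr)\, ds
 - \frac{1}{2\pi} \int_0^{2\pi} e^{ijs}\, \ln\frac{|\gamma(t_k) - \gamma(s)|^2}{4\sin^2\frac{t_k - s}{2}}\, ds \\
&= \varepsilon_j\, e^{ijt_k} - \frac{1}{2\pi} \int_0^{2\pi} e^{ijs}\, \ln\frac{|\gamma(t_k) - \gamma(s)|^2}{4\sin^2\frac{t_k - s}{2}}\, ds,
\end{aligned}$$

where $\varepsilon_j = 0$ for $j = 0$ and $\varepsilon_j = 1/|j|$ otherwise. The remaining integral is
computed by the trapezoidal rule.

The computation of (3.101b) is more complicated. By Definition (3.93a),
(3.93b) of $\hat x_j$, we have to calculate

$$\int_{t_j - \pi/(2m)}^{t_j + \pi/(2m)} \ln|\gamma(t_k) - \gamma(s)|^2\, ds = \int_{-\pi/(2m)}^{\pi/(2m)} \ln|\gamma(t_k) - \gamma(s + t_j)|^2\, ds .$$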
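
For $j \ne k$, the integrand is analytic, and we use Simpson's rule

$$\int_{-\pi/(2m)}^{\pi/(2m)} g(s)\, ds \approx \sum_{\ell=0}^{n} w_\ell\, g(s_\ell),$$

where

$$s_\ell = \ell\,\frac{\pi}{mn} - \frac{\pi}{2m}, \qquad
w_\ell = \frac{\pi}{3mn} \cdot \begin{cases} 1, & \ell = 0 \text{ or } n, \\ 4, & \ell = 1,3,\ldots,n-1, \\ 2, & \ell = 2,4,\ldots,n-2, \end{cases}$$

$\ell = 0,\ldots,n$. For $j = k$, the integral has a weak singularity at $s = 0$. We split
the integrand into

$$\begin{aligned}
&\int_{-\pi/(2m)}^{\pi/(2m)} \ln\Bigl(4\sin^2\frac{s}{2}\Bigr)\, ds + \int_{-\pi/(2m)}^{\pi/(2m)} \ln\frac{|\gamma(t_k) - \gamma(s + t_k)|^2}{4\sin^2(s/2)}\, ds \\
&\quad = -2 \int_{\pi/(2m)}^{\pi} \ln\Bigl(4\sin^2\frac{s}{2}\Bigr)\, ds + \int_{-\pi/(2m)}^{\pi/(2m)} \ln\frac{|\gamma(t_k) - \gamma(s + t_k)|^2}{4\sin^2(s/2)}\, ds
\end{aligned}$$

because $\ln(4\sin^2(s/2))$ is even and $\int_0^{\pi} \ln(4\sin^2(s/2))\, ds = 0$ by (3.64). Both
integrals are approximated by Simpson's rule. For the same example as earlier,
with 100 integration points for Simpson's rule we obtain the following results
for basis functions (3.89) (in Table 3.3) and basis functions (3.93a), (3.93b) (in
Table 3.4).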
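
The Simpson nodes and weights on $[-\pi/(2m), \pi/(2m)]$ described above can be generated as in the following Python sketch. NumPy is assumed, the function name is hypothetical, and the values m = 4 and n = 100 (subintervals, which must be even) are illustrative; the text does not specify how its "100 integration points" are counted.

```python
import numpy as np

def simpson_nodes_weights(m, n):
    """Composite Simpson rule on [-pi/(2m), pi/(2m)] with n (even) subintervals."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    ell = np.arange(n + 1)
    s = ell * np.pi / (m * n) - np.pi / (2 * m)   # nodes s_ell
    w = np.full(n + 1, 2.0)                       # interior even-indexed nodes
    w[1::2] = 4.0                                 # odd-indexed nodes
    w[0] = w[-1] = 1.0                            # end points
    return s, w * np.pi / (3 * m * n)

m_demo, n_demo = 4, 100                           # illustrative values
s, w = simpson_nodes_weights(m_demo, n_demo)
approx = np.sum(w * np.cos(s))                    # test integrand g(s) = cos(s)
exact = 2 * np.sin(np.pi / (2 * m_demo))
print(approx, exact)
```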


Table 3.3: Collocation method for basis functions (3.89) 


m d=0.1 d= 0.01 56 = 0.001 6=0 
1 6.7451 ~~ 6.7590 6.7573 6.7578 
2 1.4133 = 1.3877 1.3880 1.3879 
3 0.3556 2.791*107' 2.770*«107-' 2.769* 1071 
A 0.2525 5.979%10-" 5.752«10-* 6.758410 
5 0.3096 3.103*107? 1.110*107? 1.099 «10-2 
6 0.3404 3.486*107? 3.753*«1073 1.905 « 1073 
10 0.5600 5.782*107? 5.783*1073 6.885 « 1077 
12 0.6974 6.766*10-? 6.752* 1073 8.135 * 1079 
15 0.8017 8.371*«10-? 8.586*1073 6.436 *«10- 
20 1.1539 1i6s«10-" 118210" 1.806% 10-* 

Table 3.4: Collocation method for basis functions (3.93a) and (3.93b)

  m    δ = 0.1    δ = 0.01       δ = 0.001      δ = 0
  1    6.7461     6.7679         6.7626         6.7625
  2    1.8829     1.3562         1.3599         1.3600
  3    0.4944     4.874*10^-1    4.909*10^-1    4.906*10^-1
  4    0.3225     1.971*10^-1    2.000*10^-1    2.004*10^-1
  5    0.3373     1.640*10^-1    1.615*10^-1    1.617*10^-1
  6    0.3516     1.341*10^-1    1.291*10^-1    1.291*10^-1
 10    0.5558     8.386*10^-2    6.140*10^-2    6.107*10^-2
 12    0.6216     7.716*10^-2    4.516*10^-2    4.498*10^-2
 15    0.8664     9.091*10^-2    3.137*10^-2    3.044*10^-2
 20    1.0959     1.168*10^-1    2.121*10^-2    1.809*10^-2
 30    1.7121     1.688*10^-1    1.862*10^-2    8.669*10^-3


The difference for $\delta = 0$ reflects the fact that the best approximation

$$\min\bigl\{ \|\psi - \phi_n\|_{L^2} : \phi_n \in \mathrm{span}\{\hat x_j : j \in J\} \bigr\}$$

converges to zero exponentially for $\hat x_j$ defined by (3.89), while it converges to
zero only of order $1/n$ for $\hat x_j$ defined by (3.93a) and (3.93b) (see Theorem 3.28).

We have seen in this section that the theoretical investigations of the regu- 
larization strategies are confirmed by the numerical results for Symm’s integral 
equation. 


3.6 The Backus—Gilbert Method 


In this section, we study a different numerical method for "solving" finite
moment problems of the following type:

$$\int_a^b k_j(s)\, x(s)\, ds = y_j, \quad j = 1,\ldots,n. \tag{3.102}$$

Here, $y_j \in \mathbb{R}$ are any given numbers and $k_j \in L^2(a,b)$ arbitrary given functions.
Certainly, we have in mind that $y_j = y(t_j)$ and $k_j = k(t_j,\cdot)$. In Section 3.4, we
studied the moment solution of such problems; see [184, 229]. We saw that the
moment solution $x_n$ is a finite linear combination of the functions $\{k_1,\ldots,k_n\}$.
Therefore, the moment solution $x_n$ is as smooth as the functions $k_j$ even if the
true solution is smoother.

The concept originally proposed by Backus and Gilbert ([13, 14]) does not
primarily wish to solve the moment problem but rather wants to determine how
well all possible models $x$ can be recovered pointwise.

Define the finite-dimensional operator $K : L^2(a,b) \to \mathbb{R}^n$ by

$$(Kx)_j = \int_a^b k_j(s)\, x(s)\, ds, \quad j = 1,\ldots,n, \quad x \in L^2(a,b). \tag{3.103}$$

We try to find a left inverse $S$, that is, a linear operator $S : \mathbb{R}^n \to L^2(a,b)$ such
that

$$SKx \approx x \quad \text{for all } x \in L^2(a,b). \tag{3.104}$$

Therefore, $SKx$ should be a simultaneous approximation to all possible $x \in
L^2(a,b)$. Of course, we have to make clear the meaning of the approximation.
The general form of a linear operator $S : \mathbb{R}^n \to L^2(a,b)$ has to be

$$(Sy)(t) = \sum_{j=1}^{n} y_j\, \varphi_j(t), \quad t \in (a,b), \quad y = (y_j) \in \mathbb{R}^n, \tag{3.105}$$

for some $\varphi_j \in L^2(a,b)$ that are to be determined from the requirement (3.104):

$$(SKx)(t) = \sum_{j=1}^{n} \varphi_j(t) \int_a^b k_j(s)\, x(s)\, ds = \int_a^b \Bigl[ \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) \Bigr] x(s)\, ds .$$

The requirement $SKx \approx x$ leads to the problem of approximating Dirac's delta
distribution $\delta(s - t)$ by linear combinations of the form $\sum_{j=1}^{n} k_j(s)\, \varphi_j(t)$. For
example, one can show that the minimum of

$$\int_a^b \int_a^b \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) - \delta(s - t) \Bigr|^2 ds\, dt$$

(in the sense of distributions) is attained at $\varphi(s) = A^{-1}k(s)$, where $k(s) =
(k_1(s),\ldots,k_n(s))^\top$ and $A_{ij} = \int_a^b k_i(s)\, k_j(s)\, ds$, $i,j = 1,\ldots,n$. For this
minimization criterion, $x_n = \sum_{j=1}^{n} y_j \varphi_j$ is again the moment solution of
Subsection 3.4.1. In [184], it is shown that minimizing with respect to an $H^{-s}$-norm
for $s > 1/2$ leads to projection methods in $H^s$-spaces. We refer also to [272]
for a comparison of several minimization criteria.

The Backus–Gilbert method is based on a pointwise minimization criterion:
Treat $t \in [a,b]$ as a fixed parameter and determine the numbers $\varphi_j = \varphi_j(t)$ for
$j = 1,\ldots,n$, as the solution of the following minimization problem:

$$\text{minimize} \quad \int_a^b |s - t|^2\, \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j \Bigr|^2 ds \tag{3.106a}$$

subject to $\varphi \in \mathbb{R}^n$ and

$$\int_a^b \sum_{j=1}^{n} k_j(s)\, \varphi_j\, ds = 1. \tag{3.106b}$$

Using the matrix-vector notation, we rewrite this problem in short form:

$$\text{minimize} \quad \varphi^\top Q(t)\, \varphi \quad \text{subject to} \quad r \cdot \varphi = 1,$$

where

$$Q(t)_{ij} = \int_a^b |s - t|^2\, k_i(s)\, k_j(s)\, ds, \quad i,j = 1,\ldots,n,$$

$$r_j = \int_a^b k_j(s)\, ds, \quad j = 1,\ldots,n.$$

This is a quadratic minimization problem with one linear equality constraint.
We assume that $r \ne 0$ because otherwise the constraint (3.106b) cannot be
satisfied. Uniqueness and existence are assured by the following theorem, which
also gives a characterization by the Lagrange multiplier rule.

Theorem 3.29 Assume that $\{k_1,\ldots,k_n\}$ are linearly independent. Then the
symmetric matrix $Q(t) \in \mathbb{R}^{n\times n}$ is positive definite for every $t \in [a,b]$. The
minimization problem (3.106a), (3.106b) is uniquely solvable. $\varphi \in \mathbb{R}^n$ is a
solution of (3.106a) and (3.106b) if and only if there exists a number $\lambda \in \mathbb{R}$
(the Lagrange multiplier) such that $(\varphi, \lambda) \in \mathbb{R}^n \times \mathbb{R}$ solves the linear system

$$Q(t)\varphi - \lambda r = 0 \quad\text{and}\quad r \cdot \varphi = 1. \tag{3.107}$$

$\lambda = \varphi^\top Q(t)\, \varphi$ is the minimal value of this problem.

Proof: From

$$\varphi^\top Q(t)\, \varphi = \int_a^b |s - t|^2\, \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j \Bigr|^2 ds,$$

we conclude first that $\varphi^\top Q(t)\, \varphi \ge 0$ and second that $\varphi^\top Q(t)\, \varphi = 0$ implies
that $\sum_{j=1}^{n} k_j(s)\varphi_j = 0$ for almost all $s \in (a,b)$. Because $\{k_j\}$ are linearly
independent, $\varphi_j = 0$ for all $j$ follows. Therefore, $Q(t)$ is positive definite.
Existence, uniqueness, and equivalence to (3.107) are elementary results from
optimization theory; see [269].
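
The system (3.107) can be solved explicitly: $Q(t)\varphi = \lambda r$ and $r\cdot\varphi = 1$ give $\varphi = \lambda\, Q(t)^{-1} r$ with $\lambda = 1/(r^\top Q(t)^{-1} r)$. The following Python sketch illustrates this under stated assumptions: NumPy is assumed, the integrals are approximated by Gauss–Legendre quadrature, and the function name as well as the monomial kernels $k_j(s) = s^{j-1}$ on $[0,1]$ are hypothetical choices made only for the demonstration.

```python
import numpy as np

def backus_gilbert(kernels, a, b, t, quad_order=40):
    """Solve (3.107) at a fixed t; returns phi(t), the multiplier lambda(t), Q(t) and r."""
    nodes, weights = np.polynomial.legendre.leggauss(quad_order)
    s = 0.5 * (b - a) * nodes + 0.5 * (a + b)       # quadrature nodes on [a, b]
    w = 0.5 * (b - a) * weights
    Kmat = np.array([k(s) for k in kernels])        # Kmat[j, q] = k_j(s_q)
    Q = (Kmat * (w * (s - t) ** 2)) @ Kmat.T        # Q_ij = int |s-t|^2 k_i k_j ds
    r = Kmat @ w                                    # r_j = int k_j ds
    z = np.linalg.solve(Q, r)                       # z = Q^{-1} r
    lam = 1.0 / (r @ z)                             # lambda = 1 / (r^T Q^{-1} r)
    return lam * z, lam, Q, r

# Hypothetical kernels k_j(s) = s^(j-1), j = 1, ..., 4, on [0, 1]
kernels = [lambda s, p=p: s ** p for p in range(4)]
phi, lam, Q, r = backus_gilbert(kernels, 0.0, 1.0, t=0.3)

y = r.copy()                # moments of the constant function x = 1
x_n_at_t = y @ phi          # Backus-Gilbert value sum_j y_j phi_j(t)
print(lam, x_n_at_t)
```

Because of the constraint $r\cdot\varphi = 1$, data generated by the constant function $x \equiv 1$ are reproduced exactly, which the last lines illustrate.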


Definition 3.30 We denote by $(\varphi_j(t))_{j=1}^{n} \in \mathbb{R}^n$ the unique solution $\varphi \in \mathbb{R}^n$ of
(3.106a) and (3.106b). The Backus–Gilbert solution $x_n$ of

$$\int_a^b k_j(s)\, x_n(s)\, ds = y_j, \quad j = 1,\ldots,n,$$

is defined as

$$x_n(t) = \sum_{j=1}^{n} y_j\, \varphi_j(t), \quad t \in [a,b]. \tag{3.108}$$

The minimal value $\lambda = \lambda(t) = \varphi(t)^\top Q(t)\, \varphi(t)$ is called the spread.

We remark that, in general, the Backus–Gilbert solution $x_n = \sum_{j=1}^{n} y_j \varphi_j$
is not a solution of the moment problem, that is, $\int_a^b k_j(s)\, x_n(s)\, ds \ne y_j$! This
is certainly a disadvantage. On the other hand, the solution $x_n$ is analytic in
$[a,b]$, even for nonsmooth data $k_j$. We can prove the following lemma.


Lemma 3.31 $\varphi_j$ and $\lambda$ are rational functions. More precisely, there exist
polynomials $p_j, q \in \mathcal{P}_{2(n-1)}$ and $\rho \in \mathcal{P}_{2n}$ such that $\varphi_j = p_j/q$, $j = 1,\ldots,n$, and
$\lambda = \rho/q$. The polynomial $q$ has no zeros in $[a,b]$.

Proof: Obviously, $Q(t) = Q_0 - 2t\,Q_1 + t^2 Q_2$ with symmetric matrices $Q_0$,
$Q_1$, $Q_2$. We search for a polynomial solution $p \in [\mathcal{P}_m]^n$ and $\rho \in \mathcal{P}_{m+2}$ of
$Q(t)p(t) - \rho(t)\,r = 0$ with $m = 2(n-1)$. Because the number of equations is
$n(m+3) = 2n^2 + n$ and the number of unknowns is $n(m+1) + (m+3) =
2n^2 + n + 1$, there exists a nontrivial solution $p \in [\mathcal{P}_m]^n$ and $\rho \in \mathcal{P}_{m+2}$. If
$p(\hat t) = 0$ for some $\hat t \in [a,b]$, then $\rho(\hat t) = 0$ because $r \ne 0$. In this case, we divide
the equation by $(t - \hat t)$. Therefore, we can assume that $p$ has no zero in $[a,b]$.

Now we define $q(t) := r \cdot p(t)$ for $t \in [a,b]$. Then $q \in \mathcal{P}_m$ has no zero in $[a,b]$
because otherwise we would have

$$0 = \rho(\hat t)\, r \cdot p(\hat t) = p(\hat t)^\top Q(\hat t)\, p(\hat t);$$

thus $p(\hat t) = 0$, a contradiction. Therefore, $\varphi := p/q$ and $\lambda := \rho/q$ solves (3.107).
By the uniqueness result, this is the only solution.


For the following error estimates, we assume two kinds of a priori information
on $x$ depending on the norm of the desired error estimate. Let

$$X_n = \mathrm{span}\{k_j : j = 1,\ldots,n\}.$$

Theorem 3.32 Let $x \in L^2(a,b)$ be any solution of the finite moment problem
(3.102) and $x_n = \sum_{j=1}^{n} y_j \varphi_j$ be the Backus–Gilbert solution. Then the following
error estimates hold:

(a) Assume that $x$ is Lipschitz continuous with constant $\ell > 0$, that is,

$$|x(t) - x(s)| \le \ell\,|s - t| \quad \text{for all } s,t \in [a,b].$$

Then

$$|x_n(t) - x(t)| \le \ell\,\sqrt{b-a}\;\varepsilon_n(t) \tag{3.109}$$

for all $n \in \mathbb{N}$, $t \in [a,b]$, where $\varepsilon_n(t)$ is defined by

$$\varepsilon_n^2(t) := \min\Bigl\{ \int_a^b |s - t|^2\, |z_n(s)|^2\, ds : z_n \in X_n,\ \int_a^b z_n(s)\, ds = 1 \Bigr\}. \tag{3.110}$$

(b) Let $x \in H^1(a,b)$. Then there exists $c > 0$, independent of $x$, such that

$$\|x_n - x\|_{L^2} \le c\,\|x'\|_{L^2}\, \|\varepsilon_n\|_\infty \quad \text{for all } n \in \mathbb{N}. \tag{3.111}$$


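Proof: By the definition of the Backus–Gilbert solution and the constraint
on $\varphi$, we have

$$\begin{aligned}
x_n(t) - x(t) &= \sum_{j=1}^{n} y_j\, \varphi_j(t) - x(t) \int_a^b \sum_{j=1}^{n} k_j(s)\, \varphi_j(t)\, ds \\
&= \sum_{j=1}^{n} \int_a^b k_j(s)\, \bigl[x(s) - x(t)\bigr]\, \varphi_j(t)\, ds .
\end{aligned}$$

Thus

$$|x_n(t) - x(t)| \le \int_a^b \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) \Bigr|\, |x(s) - x(t)|\, ds .$$

Now we distinguish between parts (a) and (b):

(a) Let $|x(t) - x(s)| \le \ell\,|t - s|$. Then, by the Cauchy–Schwarz inequality and
the definition of $\varphi_j$,

$$\begin{aligned}
|x_n(t) - x(t)| &\le \ell \int_a^b 1 \cdot \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) \Bigr|\, |t - s|\, ds \\
&\le \ell\,\sqrt{b-a}\, \Bigl[ \int_a^b \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) \Bigr|^2 |t - s|^2\, ds \Bigr]^{1/2}
= \ell\,\sqrt{b-a}\;\varepsilon_n(t).
\end{aligned}$$

(b) First, we define the cutoff function $\lambda_\delta$ on $[a,b] \times [a,b]$ by

$$\lambda_\delta(t,s) = \begin{cases} 1, & |t - s| \ge \delta, \\ 0, & |t - s| < \delta. \end{cases} \tag{3.112}$$

Then, by the Cauchy–Schwarz inequality again,

$$\begin{aligned}
\Bigl[ \int_a^b \lambda_\delta(t,s)\, \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) \Bigr|\, |x(s) - x(t)|\, ds \Bigr]^2
&= \Bigl[ \int_a^b \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t)\,(t - s) \Bigr|\, \lambda_\delta(t,s)\, \Bigl| \frac{x(s) - x(t)}{s - t} \Bigr|\, ds \Bigr]^2 \\
&\le \varepsilon_n(t)^2 \int_a^b \Bigl| \frac{x(s) - x(t)}{s - t} \Bigr|^2 \lambda_\delta(t,s)\, ds .
\end{aligned}$$

Integration with respect to $t$ yields

$$\int_a^b \Bigl[ \int_a^b \lambda_\delta(t,s)\, \Bigl| \sum_{j=1}^{n} k_j(s)\, \varphi_j(t) \Bigr|\, |x(s) - x(t)|\, ds \Bigr]^2 dt
\le \|\varepsilon_n\|_\infty^2 \int_a^b \int_a^b \Bigl| \frac{x(s) - x(t)}{s - t} \Bigr|^2 \lambda_\delta(t,s)\, ds\, dt .$$

The following technical lemma from the theory of Sobolev spaces yields the
assertion.

Lemma 3.33 There exists $c > 0$ such that

$$\int_a^b \int_a^b \Bigl| \frac{x(s) - x(t)}{s - t} \Bigr|^2 \lambda_\delta(t,s)\, ds\, dt \le c\,\|x'\|_{L^2}^2$$

for all $\delta > 0$ and $x \in H^1(a,b)$. Here, the cutoff function $\lambda_\delta$ is defined by (3.112).

Proof: First, we estimate

$$|x(s) - x(t)|^2 = \Bigl| \int_s^t x'(\tau)\, d\tau \Bigr|^2 \le |t - s|\, \Bigl| \int_s^t |x'(\tau)|^2\, d\tau \Bigr|$$

and thus, for $s \ne t$,

$$\Bigl| \frac{x(s) - x(t)}{s - t} \Bigr|^2 \le \frac{1}{|s - t|}\, \Bigl| \int_s^t |x'(\tau)|^2\, d\tau \Bigr| .$$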

Now we fix ¢ € (a,b) and write 
b 
/ x(s) ~ v(t) 
s—t 


t 


[2 freparas z / 


2 
As(t, s) ds 


—t 


As(t s) / |’ (7)|? dr ds 
s 
‘ 


where 
s 
[3 ar, a<s<t, hasgty Pee 
= a _ max(d,t—s)’ -”% 
Ag(t,s) = Ds ees ee -~ 
f al dt, t<s<b, Tl tnax(0,s—f)’ a" 
s 
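
Finally, we estimate

$$\int_a^b \int_a^b \Bigl| \frac{x(s) - x(t)}{s - t} \Bigr|^2 \lambda_\delta(t,s)\, ds\, dt \le \int_a^b |x'(s)|^2 \Bigl[ \int_a^b A_\delta(t,s)\, dt \Bigr] ds$$

and

$$\int_a^b A_\delta(t,s)\, dt \le c \quad \text{for all } s \in (a,b) \text{ and } \delta > 0,$$

which is seen by elementary integration.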


From these error estimates, we observe that the rate of convergence depends
on the magnitude of $\varepsilon_n$, that is, how well the kernels approximate the delta
distribution. Finally, we study the question of convergence for $n \to \infty$.


Theorem 3.34 Assume that $\{k_j : j \in \mathbb{N}\}$ is linearly independent and dense in
$L^2(a,b)$. Then

$$\|\varepsilon_n\|_\infty \to 0 \quad \text{for } n \to \infty.$$


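Proof: For fixed $t \in [a,b]$ and arbitrary $\delta \in (0,(b-a)/2)$, we define

$$\tilde v(s) := \begin{cases} \dfrac{1}{|s - t|}, & |s - t| \ge \delta, \\[2mm] 0, & |s - t| < \delta, \end{cases}
\qquad\text{and}\qquad v(s) := \Bigl[ \int_a^b \tilde v(\tau)\, d\tau \Bigr]^{-1} \tilde v(s).$$

Then $v \in L^2(a,b)$ and $\int_a^b v(s)\, ds = 1$. Because $\bigcup_n X_n$ is dense in $L^2(a,b)$, there
exists a sequence $\tilde v_n \in X_n$ with $\tilde v_n \to v$ in $L^2(a,b)$. This implies also that
$\int_a^b \tilde v_n(s)\, ds \to \int_a^b v(s)\, ds = 1$. Therefore, the functions

$$v_n := \Bigl[ \int_a^b \tilde v_n(s)\, ds \Bigr]^{-1} \tilde v_n \in X_n$$

converge to $v$ in $L^2(a,b)$ and are normalized by $\int_a^b v_n(s)\, ds = 1$. Thus $v_n$ is
admissible, and we conclude that

$$\begin{aligned}
\varepsilon_n(t)^2 &\le \int_a^b |s - t|^2\, v_n(s)^2\, ds \\
&= \int_a^b |s - t|^2\, v(s)^2\, ds + 2 \int_a^b |s - t|^2\, v(s)\bigl[v_n(s) - v(s)\bigr]\, ds + \int_a^b |s - t|^2 \bigl[v_n(s) - v(s)\bigr]^2 ds \\
&\le (b - a) \Bigl[ \int_a^b \tilde v(s)\, ds \Bigr]^{-2} + (b - a)^2 \bigl[ 2\,\|v\|_{L^2} \|v_n - v\|_{L^2} + \|v_n - v\|_{L^2}^2 \bigr].
\end{aligned}$$

This shows that

$$\limsup_{n\to\infty} \varepsilon_n(t) \le \sqrt{b-a}\, \Bigl[ \int_a^b \tilde v(s)\, ds \Bigr]^{-1} \quad \text{for all } t \in [a,b].$$

Direct computation yields

$$\int_a^b \tilde v(s)\, ds \ge c + |\ln \delta|$$

for some $c$ independent of $\delta$; thus

$$\limsup_{n\to\infty} \varepsilon_n(t) \le \frac{\sqrt{b-a}}{c + |\ln \delta|} \quad \text{for all } \delta \in (0,(b-a)/2).$$

This yields pointwise convergence, that is, $\varepsilon_n(t) \to 0$ $(n \to \infty)$ for every $t \in
[a,b]$. Because $\varepsilon_n(t)$ is monotonic with respect to $n$, Dini's well-known theorem
from classical analysis (see, for example, [231]) yields uniform convergence.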


For further aspects of the Backus–Gilbert method, we refer to [32, 114, 144,
162, 163, 244, 272, 273].


3.7 Problems 


3.1 Let $Q_n : C[a,b] \to S_1(t_1,\ldots,t_n)$ be the interpolation operator from
Example 3.3. Prove that $\|Q_n\|_{\mathcal{L}(C[a,b])} = 1$ and derive an estimate of the form

$$\|Q_n x - x\|_\infty \le c\,h\,\|x'\|_\infty$$

for $x \in C^1[a,b]$, where $h = \max\{t_j - t_{j-1} : j = 2,\ldots,n\}$.


3.2 Let $K : X \to X$ be self-adjoint and positive definite and let $y \in X$.
Define $\psi(x) = (Kx,x)_X - 2\,\mathrm{Re}(y,x)_X$ for $x \in X$. Prove that $x^* \in X$ is a
minimum of $\psi$ if and only if $x^*$ solves $Kx^* = y$.


3.3 Define the space $X_n$ by

$$X_n = \Bigl\{ \sum_{|j| \le n} a_j\, e^{ijt} : a_j \in \mathbb{C} \Bigr\}$$

and let $P_n : L^2(0,2\pi) \to X_n$ be the orthogonal projection operator. Prove
that for $r \ge s$ there exists $c > 0$ such that

$$\|\psi_n\|_{H^r_{per}} \le c\, n^{r-s}\, \|\psi_n\|_{H^s_{per}} \quad \text{for all } \psi_n \in X_n,$$

$$\|P_n\psi - \psi\|_{H^s_{per}} \le c\, \frac{1}{n^{r-s}}\, \|\psi\|_{H^r_{per}} \quad \text{for all } \psi \in H^r_{per}(0,2\pi).$$

3.4 Show that the worst-case error of Symm's equation under the information
$\|\psi\|_{H^s_{per}} \le E$ for some $s > 0$ is given by

$$\mathcal{F}\bigl(\delta, E, \|\cdot\|_{H^s_{per}}\bigr) \le c\, \delta^{s/(s+1)} .$$


3.5 Let $\Omega \subset \mathbb{R}^2$ be the disk of radius $a = \exp(-1/2)$. Then $\psi = 1$ is the
unique solution of Symm's integral equation (3.57) for $f = 1$. Compute
explicitly the errors of the least squares solution, the dual least squares
solution, and the Bubnov–Galerkin solution as in Section 3.3, and verify
that the error estimates of Theorem 3.18 are asymptotically sharp.


3.6 Let $t_k = k/n$, $k = 1,\ldots,n$, be equidistant collocation points. Let $X_n$ be
the space of piecewise constant functions as in (3.81) and $P_n : L^2(0,1) \to
X_n$ be the orthogonal projection operator. Prove that $\bigcup_n X_n$ is dense in
$L^2(0,1)$ and

$$\|x - P_n x\|_{L^2} \le \frac{1}{n}\, \|x'\|_{L^2}$$

for all $x \in H^1(0,1)$ (see Problem 3.1).


3.7 Show that the moment solution can also be interpreted as the solution of
a dual least squares method.


3.8 Consider moment collocation of the equation

$$\int_0^t x(s)\, ds = y(t), \quad t \in [0,1],$$

in the space $X_n = S_1(t_1,\ldots,t_n)$ of linear splines. Show that the moment
solution $x_n$ coincides with the two-sided difference quotient, that is,

$$x_n(t_j) = \frac{1}{2h}\, \bigl[ y(t_{j+1} + h) - y(t_{j-1} - h) \bigr],$$

where $h = 1/n$. Derive an error estimate for $\|x_n^\delta - x^*\|_{L^2}$ as in Example 3.23.




Chapter 4 


Nonlinear Inverse Problems 


In the previous chapters, we considered linear problems which we wrote as $Kx =
y$, where $K$ was a linear and (often) compact operator between Hilbert spaces.
Needless to say, most problems in applications are nonlinear. For example,
even in the case of a linear differential equation of the form $-u'' + cu = f$ for
the function $u$, the dependence of $u$ on the parameter function $c$ is nonlinear;
that is, the mapping $c \mapsto u$ is nonlinear.¹ In Chapters 5, 6, and 7 we will study
particular nonlinear problems to determine parameters of an ordinary or partial 
differential equation from the knowledge of the solution. Although we believe 
that the best strategies for solving nonlinear problems are intrinsically linked to 
the particular nature of the underlying problem, there are general methods for 
solving these nonlinear problems if they can be written in the form K(x) = y, 
where K is now a nonlinear operator between Hilbert spaces or Banach spaces. 
Guided by the structure of Chapter 2, we will study the nonlinear form of 
the Tikhonov regularization in Section 4.2 and the extension of the Landweber 
method in Section 4.3. Since the investigation of the latter one is already rather 
complicated, we do not present the extension of the conjugate gradient method 
or methods of Newton type, but refer the interested reader to the monograph 
[149] of Kaltenbacher, Neubauer, and Scherzer. 

We start this chapter with a clarification of the notion of ill-posedness in 
Section 4.1 and its relation to the ill-posedness of the linearized problem. In 
Section 4.2, we study Tikhonov’s regularization method for nonlinear problems. 
In contrast to the linear case, the question of existence of a global minimum 
of the Tikhonov functional is not obvious and requires more advanced tools 
from functional analysis, in particular, on weak topologies. We include the 
arguments in Subsection 4.2.1 but note already here that this Subsection is 
not central for the understanding of the further theory. It can be omitted 
because we formulate the existence of a minimum also as Assumption 4.9 at 
the beginning of Subsection 4.2.2. After the application of the general theory 
to the abovementioned parameter identification problem for a boundary value
problem for an ordinary differential equation, in Subsection 4.2.4, we present
some of the basic ideas for Tikhonov's method in Banach spaces and more
general metrics to penalize the discrepancy and to measure the error. As a
particular example, we consider the determination of a sparse solution of a
linear equation. Further, tools from convex analysis are needed, such as the
subdifferential and the Bregman distance.

¹ The differential equation has to be complemented by initial or boundary conditions, of
course.

© Springer Nature Switzerland AG 2021

Finally, in Section 4.3, we return to the Hilbert space setting and extend the 
Landweber method from Sections 2.3 and 2.6 to the nonlinear case. 


4.1 Local Illposedness 


In this chapter, we assume that $X$ and $Y$ are normed spaces (in most cases
Hilbert spaces) and that $K : X \supset D(K) \to Y$ is a nonlinear mapping with domain of
definition $D(K) \subset X$. Let $x^* \in D(K)$ and $y^* \in Y$ such that $K(x^*) = y^*$.
It is the aim, as in the linear case, to determine an approximate solution to
$x^*$ when the right-hand side $y^*$ is perturbed; that is, replaced by $y^\delta \in Y$ with
$\|y^\delta - y^*\|_Y \le \delta$. In the following, let $B(x,r) = \{z : \|x - z\|_X < r\}$ denote the
open ball centered at $x$ with radius $r$. The following notion of local ill-posedness
goes back to Hofmann and Scherzer (see, e.g., [137]).

Definition 4.1 Let $x^* \in D(K)$ and $y^* = K(x^*)$. The equation $K(x) = y$ is
called locally improperly-posed or locally ill-posed at $x^*$ if for any sufficiently
small $\rho > 0$ there exists a sequence $x_n \in D(K) \cap B(x^*,\rho)$ such that $K(x_n) \to K(x^*)$
but $(x_n)$ does not converge to $x^*$ as $n$ tends to infinity.
Example 4.2
Let $k \in C^1([0,1] \times [0,1] \times \mathbb{R})$, $k = k(t,s,r)$, and let there exist $c_1 > 0$ with
$|\partial k(t,s,r)/\partial r| \le c_1$ for all $(t,s,r) \in [0,1] \times [0,1] \times \mathbb{R}$. Define
\[ K(x)(t) = \int_0^1 k\bigl(t,s,x(s)\bigr)\,ds, \quad t \in [0,1], \quad \text{for } x \in L^2(0,1). \]
Then $K$ is well-defined from $L^2(0,1)$ into itself, and the equation $K(x) = y$ is
locally ill-posed at $x = 0$.

Proof: Let $x \in L^2(0,1)$. The application of the fundamental theorem of
calculus in the form
\[ k(t,s,r) = k(t,s,0) + \int_0^r \frac{\partial k}{\partial r}(t,s,r')\,dr', \quad r \in \mathbb{R}, \tag{4.1} \]
implies $|k(t,s,r)| \le |k(t,s,0)| + c_1 |r|$, thus $|k(t,s,r)|^2 \le 2|k(t,s,0)|^2 + 2c_1^2 r^2$,
thus, using the Cauchy-Schwarz inequality,
\[ |K(x)(t)|^2 \le \int_0^1 \bigl|k\bigl(t,s,x(s)\bigr)\bigr|^2\,ds \le 2 \int_0^1 \bigl[\,|k(t,s,0)|^2 + c_1^2\, x(s)^2\bigr]\,ds \]
for all $t \in [0,1]$. Therefore, $K(x)$ is measurable and $|K(x)(t)|^2$ is bounded which
implies $K(x) \in L^2(0,1)$.


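Let now $\rho > 0$ be arbitrary and $x_n(t) = \rho\sqrt{2n+1}\,t^n$, $t \in [0,1]$. Then
\[ \|x_n\|_{L^2(0,1)}^2 = \rho^2 (2n+1) \int_0^1 t^{2n}\,dt = \rho^2 \]
and, with (4.1),
\[ |K(x_n)(t) - K(0)(t)| \le c_1 \int_0^1 |x_n(s)|\,ds = c_1 \rho \sqrt{2n+1} \int_0^1 s^n\,ds = \frac{c_1 \rho \sqrt{2n+1}}{n+1} \to 0 \quad \text{as } n \to \infty. \]
Hence $K(x_n) \to K(0)$ uniformly on $[0,1]$, and thus in $L^2(0,1)$, while
$\|x_n\|_{L^2(0,1)} = \rho$ for all $n$, so that $(x_n)$ does not converge to $0$.
Therefore, the equation $K(x) = y$ is locally improperly-posed at $x = 0$.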
A second, and more concrete, example is formulated as Problem 4.2. 
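
The mechanism of Example 4.2 can be observed numerically. The following is only a minimal sketch under assumptions that are not part of the text: the kernel is taken to be $k(t,s,r) = \sin r$ (so that $c_1 = 1$, $K(0) = 0$, and $K(x)$ is constant in $t$), the integrals are approximated by a composite midpoint rule, and the helper names `midpoint`, `x_n`, and `illustrate` are hypothetical.

```python
import math

def midpoint(f, m=20000):
    """Composite midpoint rule for the integral of f over [0, 1]."""
    h = 1.0 / m
    return h * sum(f((i + 0.5) * h) for i in range(m))

def x_n(n, rho):
    """The sequence from the proof: x_n(t) = rho * sqrt(2n + 1) * t**n."""
    scale = rho * math.sqrt(2 * n + 1)
    return lambda t: scale * t ** n

def illustrate(n, rho=0.5):
    """Return (||x_n||_{L^2}, |K(x_n) - K(0)|, c_1*rho*sqrt(2n+1)/(n+1)).

    Assumed kernel (not from the text): k(t, s, r) = sin(r), hence c_1 = 1,
    K(0) = 0, and K(x)(t) = int_0^1 sin(x(s)) ds does not depend on t.
    """
    xn = x_n(n, rho)
    norm = math.sqrt(midpoint(lambda s: xn(s) ** 2))
    gap = abs(midpoint(lambda s: math.sin(xn(s))))
    bound = rho * math.sqrt(2 * n + 1) / (n + 1)
    return norm, gap, bound

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, illustrate(n))
```

For growing $n$ the first entry stays close to $\rho$, whereas the second entry stays below the third, which tends to zero: the data $K(x_n)$ approach $K(0)$ although $x_n$ keeps its distance from $0$.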


If $K$ is continuously Fréchet-differentiable at $x^*$ with Lipschitz continuous
derivative then local ill-posedness of the nonlinear problem implies the
ill-posedness of the linearization.


Theorem 4.3 Let $K$ be Fréchet-differentiable in the ball $B(x^*,\hat\rho)$ and let there
exist $\gamma > 0$ with $\|K'(x) - K'(x^*)\|_{\mathcal{L}(X,Y)} \le \gamma \|x - x^*\|_X$ for all $x \in B(x^*,\hat\rho)$. Let
the equation $K(x) = y$ be locally ill-posed at $x^*$. Then $K'(x^*)$ is not boundedly
invertible; that is, the linear equation $K'(x^*)h = z$ is also ill-posed.


Proof: We assume on the contrary that $K'(x^*)$ is boundedly invertible and
choose $\rho \in (0,\hat\rho)$ such that $\frac{\gamma\rho}{2}\,\|K'(x^*)^{-1}\|_{\mathcal{L}(Y,X)} =: q < 1$. For this $\rho$ we choose
a sequence $x_n \in B(x^*,\rho)$ by Definition 4.1. Lemma A.63 of Appendix A.7
implies the representation
\[ K(x_n) - K(x^*) = K'(x^*)(x_n - x^*) + r_n \quad \text{with} \quad \|r_n\|_Y \le \frac{\gamma}{2}\,\|x_n - x^*\|_X^2 \]
for all $n \in \mathbb{N}$; that is,
\[ K'(x^*)^{-1}\bigl[K(x_n) - K(x^*)\bigr] = x_n - x^* + K'(x^*)^{-1} r_n \quad \text{for all } n \in \mathbb{N}, \]
thus
\[ \begin{aligned}
\|x_n - x^*\|_X &\le \bigl\|K'(x^*)^{-1}[K(x_n) - K(x^*)]\bigr\|_X + \|K'(x^*)^{-1} r_n\|_X \\
&\le \|K'(x^*)^{-1}\|_{\mathcal{L}(Y,X)}\,\|K(x_n) - K(x^*)\|_Y + \frac{\gamma}{2}\,\|K'(x^*)^{-1}\|_{\mathcal{L}(Y,X)}\,\|x_n - x^*\|_X^2 \\
&\le \|K'(x^*)^{-1}\|_{\mathcal{L}(Y,X)}\,\|K(x_n) - K(x^*)\|_Y + q\,\|x_n - x^*\|_X,
\end{aligned} \]
thus
\[ (1-q)\,\|x_n - x^*\|_X \le \|K'(x^*)^{-1}\|_{\mathcal{L}(Y,X)}\,\|K(x_n) - K(x^*)\|_Y, \]
and this expression converges to zero because $K(x_n) \to K(x^*)$. This contradicts
the fact that $(x_n)$ does not converge to $x^*$.


We will see in Section 4.3 that also the reverse assertion is true provided an
additional condition ("tangential cone condition") is satisfied; see Lemma 4.35.


4.2 The Nonlinear Tikhonov Regularization 


In this section (except for Subsection 4.2.4), we assume that $X$ and $Y$ are Hilbert
spaces with inner products $(\cdot,\cdot)_X$ and $(\cdot,\cdot)_Y$, respectively, and corresponding
norms. Let again $x^* \in D(K)$ and $y^* \in Y$ with $K(x^*) = y^*$ and $y^\delta \in Y$ with
$\|y^\delta - y^*\|_Y \le \delta$. Let, in addition, $\hat x \in X$ be given which is thought of as an
approximation of the true solution $x^*$. We define the Tikhonov functional by
\[ J_{\alpha,\delta}(x) = \|K(x) - y^\delta\|_Y^2 + \alpha\,\|x - \hat x\|_X^2, \quad x \in D(K). \tag{4.2} \]


In the first subsection, we will discuss briefly the question of existence of minima 
of the Tikhonov functional and stability with respect to perturbation of y°. This 
part needs some knowledge of the weak topology in Hilbert spaces. We have 
collected the needed results in Section A.9 of the Appendix for the convenience 
of the reader. One can easily drop this subsection if one is not familiar with (or
interested in) this part of functional analysis. The further analysis is quite
independent of this subsection. 


4.2.1 Existence of Solutions and Stability 


We recall the following definition (see remark following Definition A.75 of the 
Appendix A.9). 


Definition 4.4 Let $X$ be a Hilbert space. A sequence $(x_n)$ in $X$ is said to
converge weakly to $x \in X$ if $\lim_{n\to\infty} (x_n, z)_X = (x, z)_X$ for all $z \in X$.

If $Y$ is a second Hilbert space and $K : X \supset D(K) \to Y$ is a (nonlinear) mapping
then $K$ is called weak-to-weak continuous if $K$ maps weakly convergent
sequences in $D(K)$ into weakly convergent sequences in $Y$.


We will need the following two results (see Corollary A.78 and part (d) of 
Theorem A.76). 


Theorem 4.5 Let X be a Hilbert space. 


(a) Every bounded sequence $(x_n)$ in $X$ contains a weak accumulation point;
that is, a weakly convergent subsequence.


(b) Every convex and closed set $U \subset X$ is also weakly closed; that is, if the
sequence $(x_n)$ in $U$ converges weakly to some $x \in X$ then necessarily
$x \in U$.

The norm function $\|\cdot\|_X$ fails to be weakly continuous but has the following
property which is sometimes called "weak lower semi-continuity".


Lemma 4.6 Let $X$ be a Hilbert space and let the sequence $(x_n)$ converge weakly
to $x$. Then
\[ \liminf_{n\to\infty} \|x_n - z\|_X \ge \|x - z\|_X \quad \text{for all } z \in X. \]

Proof: This follows from the formula
\[ \|x_n - z\|_X^2 - \|x - z\|_X^2 = 2\,\mathrm{Re}\,(x_n - x,\, x - z)_X + \|x_n - x\|_X^2 \ge 2\,\mathrm{Re}\,(x_n - x,\, x - z)_X \]
because the right hand side of the inequality tends to zero as $n$ tends to infinity.
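
That the norm is only weakly lower semi-continuous, and not weakly continuous, can be seen on a standard example. The sketch below is an illustration under assumptions of our own choosing: the space is $L^2(0,1)$, the sequence is $x_n(t) = \sqrt{2}\sin(n\pi t)$ (orthonormal, hence weakly convergent to $0$), the test element is $z(t) = t$, and inner products are approximated by a midpoint rule. The names `inner`, `norm`, `x_n`, `z`, and `report` are hypothetical.

```python
import math

def inner(f, g, m=20000):
    """Midpoint-rule approximation of the L^2(0,1) inner product (f, g)."""
    h = 1.0 / m
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(m))

def norm(f):
    """Approximate L^2(0,1) norm."""
    return math.sqrt(inner(f, f))

def x_n(n):
    """x_n(t) = sqrt(2) sin(n pi t): orthonormal, weakly convergent to 0."""
    return lambda t: math.sqrt(2.0) * math.sin(n * math.pi * t)

def z(t):
    """A fixed test element z(t) = t (an arbitrary illustrative choice)."""
    return t

def report(n):
    """Return ((x_n, z), ||x_n||, ||x_n - z||, ||0 - z||)."""
    xn = x_n(n)
    pairing = inner(xn, z)                  # tends to (0, z) = 0
    dist = norm(lambda t: xn(t) - z(t))     # stays above ||0 - z||
    return pairing, norm(xn), dist, norm(z)

if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, report(n))
```

The pairing with $z$ tends to zero while $\|x_n\| = 1$ for all $n$, and $\|x_n - z\|$ remains strictly larger than $\|0 - z\|$, in agreement with Lemma 4.6.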


Now we are able to prove the existence of minima of the Tikhonov functional 
under appropriate assumptions. 


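Theorem 4.7 Let $X$ and $Y$ be Hilbert spaces, $y^\delta \in Y$, and $K : X \supset D(K) \to Y$
weak-to-weak continuous with convex and closed domain of definition $D(K)$.
Then there exists a global minimum of $J_{\alpha,\delta}$, defined in (4.2), on $D(K)$ for all
$\alpha > 0$.


Proof: Let $x_n \in D(K)$ be a minimizing sequence; that is, $J_{\alpha,\delta}(x_n) \to J^* :=
\inf\{J_{\alpha,\delta}(x) : x \in D(K)\}$ as $n \to \infty$. From
\[ \alpha\,\|x_n - \hat x\|_X^2 \le J_{\alpha,\delta}(x_n) \le J^* + 1 \]
for sufficiently large $n$, we conclude that the sequence $(x_n)$ is bounded. By
part (a) of Theorem 4.5, there exists a subsequence, which we also denote by
$(x_n)$, which converges weakly to some $\bar x$. Also, $\bar x \in D(K)$ by part (b) of this
theorem. Furthermore, $K(x_n)$ converges weakly to $K(\bar x)$ by the assumption on
$K$. Lemma 4.6 implies that $\liminf \|x_n - \hat x\|_X \ge \|\bar x - \hat x\|_X$ and
$\liminf \|K(x_n) - y^\delta\|_Y \ge \|K(\bar x) - y^\delta\|_Y$. Therefore, for any $\varepsilon > 0$ there exists $N \in \mathbb{N}$ such that
for all $n \ge N$
\[ \begin{aligned}
J_{\alpha,\delta}(\bar x) &= \alpha\,\|\bar x - \hat x\|_X^2 + \|K(\bar x) - y^\delta\|_Y^2 \\
&\le \alpha\,\|x_n - \hat x\|_X^2 + \|K(x_n) - y^\delta\|_Y^2 + \varepsilon = J_{\alpha,\delta}(x_n) + \varepsilon \\
&\le J^* + 2\varepsilon.
\end{aligned} \]
This holds for all $\varepsilon > 0$, thus $J^* \le J_{\alpha,\delta}(\bar x) \le J^*$ which proves that $\bar x$ is a
minimum of $J_{\alpha,\delta}$.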


In the same way one proves stability.

Theorem 4.8 Let the assumptions of the previous theorem hold, and in addition,
let $(y_n)$ be a sequence with $y_n \to y^\delta$ as $n \to \infty$. Let $x_n \in D(K)$ be a minimum
of $J_n(x) := \alpha\,\|x - \hat x\|_X^2 + \|K(x) - y_n\|_Y^2$ on $D(K)$. Then there exist weak
accumulation points of the sequence $(x_n)$, and every weak accumulation point $\bar x$ is a
minimum of $J_{\alpha,\delta}$.

If, in addition, $K$ is weak-to-norm continuous, then every weak accumulation
point $\bar x$ of $(x_n)$ is also an accumulation point with respect to the norm.


Proof: The estimate
\[ \alpha\,\|x_n - \hat x\|_X^2 + \|K(x_n) - y_n\|_Y^2 \le \bigl(\|K(\hat x) - y^\delta\|_Y + \|y^\delta - y_n\|_Y\bigr)^2 \le \bigl(\|K(\hat x) - y^\delta\|_Y + 1\bigr)^2 \]
for sufficiently large $n$ implies again that the sequence $(x_n)$ is bounded. Thus,
it contains weak accumulation points by Theorem 4.5. Let $\bar x \in D(K)$ be a weak
accumulation point; without loss of generality let $x_n$ itself converge weakly
to $\bar x$. Then $K(x_n) - y_n$ converges weakly to $K(\bar x) - y^\delta$ and thus as before
$\liminf \|x_n - \hat x\|_X \ge \|\bar x - \hat x\|_X$ and $\liminf \|K(x_n) - y_n\|_Y \ge \|K(\bar x) - y^\delta\|_Y$. Set
$J_n^* = J_n(x_n)$. Then, for any $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $n \ge N$
and all $x \in D(K)$
\[ \begin{aligned}
J_{\alpha,\delta}(\bar x) &\le \alpha\,\|x_n - \hat x\|_X^2 + \|K(x_n) - y_n\|_Y^2 + \varepsilon = J_n^* + \varepsilon \\
&\le J_n(x) + \varepsilon = \alpha\,\|x - \hat x\|_X^2 + \|K(x) - y_n\|_Y^2 + \varepsilon.
\end{aligned} \tag{4.3} \]
Now we let $n$ tend to infinity. This yields
\[ J_{\alpha,\delta}(\bar x) \le \liminf_{n\to\infty} J_n^* + \varepsilon \le \limsup_{n\to\infty} J_n^* + \varepsilon \le J_{\alpha,\delta}(x) + \varepsilon. \]
Letting $\varepsilon$ tend to zero proves the optimality of $\bar x$ and also the convergence of
$J_n^*$ to $J_{\alpha,\delta}(\bar x)$.

Let now $K$ be weak-to-norm continuous. From
\[ \alpha\,\bigl(\|x_n - \hat x\|_X^2 - \|\bar x - \hat x\|_X^2\bigr) = J_n^* - J_{\alpha,\delta}(\bar x) + \|K(\bar x) - y^\delta\|_Y^2 - \|K(x_n) - y_n\|_Y^2 \]
and the convergence of $K(x_n)$ to $K(\bar x)$, we conclude that the right-hand side
converges to zero and thus $\|x_n - \hat x\|_X \to \|\bar x - \hat x\|_X$. Finally, the binomial formula
yields
\[ \|x_n - \bar x\|_X^2 = \|(x_n - \hat x) - (\bar x - \hat x)\|_X^2 = \|x_n - \hat x\|_X^2 + \|\bar x - \hat x\|_X^2 - 2\,\mathrm{Re}\,(x_n - \hat x,\, \bar x - \hat x)_X \]
which tends to zero.


4.2.2 Source Conditions And Convergence Rates 


We make the following general assumption.

Assumption 4.9 (a) $D(K)$ is open, and $x^* \in D(K) \cap B(\hat x,\rho)$ is a solution
of $K(x) = y^*$ in some ball $B(\hat x,\rho)$, and $K$ is continuously Fréchet-differentiable
on $D(K) \cap B(\hat x,\rho)$.


(b) $K'(x^*)$ is compact from $X$ into $Y$.


(c) The Tikhonov functional $J_{\alpha,\delta}$ possesses global minima $x^{\alpha,\delta} \in D(K)$ on
$D(K)$ for all $\alpha, \delta > 0$; that is,
\[ \alpha\,\|x^{\alpha,\delta} - \hat x\|_X^2 + \|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 \le \alpha\,\|x - \hat x\|_X^2 + \|K(x) - y^\delta\|_Y^2 \tag{4.4} \]
for all $x \in D(K)$.


We refer to the previous subsection where we showed part (c) of this assumption 
under appropriate smoothness assumptions on K. 

Now we are able to formulate the condition which corresponds to the “source 
condition” in the linear case. We will include the more general form with index 
functions and introduce the following notion. 


Definition 4.10 Any monotonically increasing and continuous function
$\varphi : [0,\delta_{\max}] \to \mathbb{R}$ (for some $\delta_{\max} > 0$) with $\varphi(0) = 0$ is called an index function.


The most prominent examples for index functions are $\varphi(t) = t^\sigma$ for any
$\sigma > 0$ but also $\varphi(t) = -1/\ln t$ for $0 < t < 1$ (and $\varphi(0) = 0$). By calculating the
second derivative, one observes that the latter one is concave on $[0, 1/e^2]$ and
the first class is concave whenever $\sigma \le 1$. The linear case $\varphi(t) = \beta t$ for $t \ge 0$ is
particularly important.
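
The concavity statements can be verified directly. The following short computation is a sketch added for convenience and uses only the two index functions named above.

```latex
% Power type: \varphi(t) = t^\sigma, t > 0
\varphi''(t) = \sigma(\sigma - 1)\, t^{\sigma - 2} \le 0
\quad\Longleftrightarrow\quad \sigma \le 1 .

% Logarithmic type: \varphi(t) = -1/\ln t = -(\ln t)^{-1}, 0 < t < 1
\varphi'(t)  = \frac{1}{t\,(\ln t)^2} > 0 , \qquad
\varphi''(t) = -\frac{2}{t^2 (\ln t)^3} - \frac{1}{t^2 (\ln t)^2}
             = -\frac{\ln t + 2}{t^2 (\ln t)^3} .

% For 0 < t \le e^{-2}: \ln t + 2 \le 0 and (\ln t)^3 < 0, hence
\frac{\ln t + 2}{t^2 (\ln t)^3} \ge 0
\quad\Longrightarrow\quad \varphi''(t) \le 0 .
```

In both cases $\varphi' > 0$ confirms monotonicity, and the sign of $\varphi''$ gives concavity exactly on the stated ranges.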


Assumption 4.11 (Source condition) Let $D(K)$ be open and $x^* \in D(K) \cap
B(\hat x,\rho)$ be a solution of $K(x) = y^*$ in some ball $B(\hat x,\rho)$ and let $K$ be differentiable
on $D(K) \cap B(\hat x,\rho)$. Furthermore, let $\varphi : [0,\delta_{\max}] \to \mathbb{R}$ be a concave index
function with $\delta_{\max} \ge \|K'(x^*)\|_{\mathcal{L}(X,Y)}$.


(i) Let $K'$ be locally Lipschitz continuous; that is, there exists $\gamma > 0$ with
\[ \|K'(x) - K'(x^*)\|_{\mathcal{L}(X,Y)} \le \gamma\,\|x - x^*\|_X \quad \text{for all } x \in B(x^*,\rho) \cap D(K), \]

(ii) and there exists $w \in X$ with
\[ x^* - \hat x = \varphi\bigl([K'(x^*)^* K'(x^*)]^{1/2}\bigr)\, w \quad \text{and} \quad \gamma\,\|w\|_X < 1. \]


We note that for a linear compact operator $A : X \to Y$ (in the present
case $A := K'(x^*)$) the operator $\varphi([A^*A]^{1/2})$ from $X$ into itself is defined as
in (A.47) by a singular system $\{\mu_j, x_j, y_j : j \in J\}$ for $A$, see Appendix A.6,
Theorem A.57, where $J$ is finite or $J = \mathbb{N}$, namely,
\[ \varphi\bigl([A^*A]^{1/2}\bigr)\, z = \sum_{j \in J} \varphi(\mu_j)\,(z, x_j)_X\, x_j, \quad z \in X. \]
In the special case that $\varphi(t) = t^\sigma$ the condition reads as $x^* - \hat x = (A^*A)^{\sigma/2} w$
which is just the source condition of the linear case (for $\hat x = 0$). Again, for
$\sigma = 1$ the ranges $\mathcal{R}\bigl([K'(x^*)^* K'(x^*)]^{1/2}\bigr)$ and $\mathcal{R}\bigl(K'(x^*)^*\bigr)$ coincide which is
seen from the singular system $\{\mu_j, x_j, y_j : j \in J\}$. Therefore, in this linear case
part (ii) takes the form $x^* - \hat x = K'(x^*)^* v$ for some $v \in Y$ with $\gamma\,\|v\|_Y < 1$.


In the past decade, a different kind of source condition has been developed
which does not need the derivative of $K$. It can be generalized to a wider
class of Tikhonov functionals with non-differentiable $K$ acting between Banach
spaces. We refer to Subsection 4.2.4 for a brief glimpse of these extensions.


Assumption 4.12 (Variational source condition) Let $D(K)$ be open and $x^* \in
D(K) \cap B(\hat x,\rho)$ be a solution of $K(x) = y^*$ in some ball $B(\hat x,\rho)$ and $\varphi :
[0,\delta_{\max}] \to \mathbb{R}$ be a concave index function with $\delta_{\max} \ge \sup\{\|K(x^*) - K(x)\|_Y :
x \in D(K) \cap B(\hat x,\rho)\}$. Furthermore, there exists a constant $\beta > 0$ such that
\[ \beta\,\|x^* - x\|_X^2 \le \|x - \hat x\|_X^2 - \|x^* - \hat x\|_X^2 + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr) \]
for all $x \in B(\hat x,\rho) \cap D(K)$.


This assumption is also known as a variational inequality (see, e.g., [245]). 
We prefer the notion of variational source condition as, e.g., in [96], because 
it takes the role of the source condition. We now show a relationship between 
Assumption 4.11 and Assumption 4.12. 


Theorem 4.13 Let $D(K)$ be open and $x^* \in D(K) \cap B(\hat x,\rho)$ be a solution of
$K(x) = y^*$ with $\rho \in (0,1/2)$ and $\varphi : [0,\delta_{\max}] \to \mathbb{R}$ be a concave index function
with $\delta_{\max} \ge \|K'(x^*)\|_{\mathcal{L}(X,Y)}$ and $\delta_{\max} \ge \sup\{\|K(x^*) - K(x)\|_Y : x \in D(K) \cap
B(\hat x,\rho)\}$.

(a) The variational source condition of Assumption 4.12 is equivalent to the
following condition: There exists $0 \le \sigma < 1$ such that
\[ 2\,\mathrm{Re}\,(x^* - \hat x,\, x^* - x)_X \le \sigma\,\|x^* - x\|_X^2 + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr) \tag{4.5} \]
for all $x \in B(\hat x,\rho) \cap D(K)$.


(b) Let Assumption 4.11 hold and, in addition, $\varphi(t) = t$ for all $t$ or
$\|K'(x^*)(x - x^*)\|_Y \le \eta\bigl(\|K(x) - K(x^*)\|_Y\bigr)$ for all $x \in B(\hat x,\rho) \cap D(K)$
where $\eta$ is another concave index function. Then also Assumption 4.12
holds with some index function $\tilde\varphi$ which is linear if $\varphi$ is linear.


(c) Let Assumption 4.12 hold for $\varphi(t) = \beta t$ for all $t \ge 0$. Then
$x^* - \hat x \in \mathcal{R}\bigl(K'(x^*)^*\bigr) = \mathcal{R}\bigl([K'(x^*)^* K'(x^*)]^{1/2}\bigr)$.


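Proof: (a) From the elementary equation
\[ \|x - \hat x\|_X^2 - \|x^* - \hat x\|_X^2 = 2\,\mathrm{Re}\,(x^* - \hat x,\, x - x^*)_X + \|x - x^*\|_X^2 \]
we observe that the variational source condition is equivalent to
\[ \beta\,\|x - x^*\|_X^2 \le 2\,\mathrm{Re}\,(x^* - \hat x,\, x - x^*)_X + \|x - x^*\|_X^2 + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr), \]
that is,
\[ 2\,\mathrm{Re}\,(x^* - \hat x,\, x^* - x)_X \le (1 - \beta)\,\|x - x^*\|_X^2 + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr), \]
which has the desired form with $\sigma = 1 - \beta$ if $\beta \le 1$ and $\sigma = 0$ if $\beta > 1$.

(b) Set $A := K'(x^*)$ for abbreviation. Then, for $x \in B(\hat x,\rho) \cap D(K)$,
\[ \begin{aligned}
\mathrm{Re}\,(x^* - \hat x,\, x^* - x)_X &= \mathrm{Re}\,\bigl(\varphi([A^*A]^{1/2})\, w,\; x^* - x\bigr)_X \\
&= \mathrm{Re}\,\bigl(w,\; \varphi([A^*A]^{1/2})(x^* - x)\bigr)_X \\
&\le \|w\|_X\, \bigl\|\varphi([A^*A]^{1/2})(x^* - x)\bigr\|_X .
\end{aligned} \]
Now we use the fact that for any concave index function
\[ \bigl\|\varphi([A^*A]^{1/2})\, z\bigr\|_X \le \varphi\bigl(\|Az\|_Y\bigr) \quad \text{for all } z \in X \text{ with } \|z\|_X \le 1. \]
For a proof, we refer to Lemma A.73 of the Appendix. Therefore,
\[ \mathrm{Re}\,(x^* - \hat x,\, x^* - x)_X \le \|w\|_X\, \varphi\bigl(\|K'(x^*)(x^* - x)\|_Y\bigr). \tag{4.6} \]
If $\|K'(x^*)(x - x^*)\|_Y \le \eta\bigl(\|K(x) - K(x^*)\|_Y\bigr)$ for all $x \in B(\hat x,\rho) \cap D(K)$ then
\[ \mathrm{Re}\,(x^* - \hat x,\, x^* - x)_X \le \|w\|_X\, \varphi\bigl(\eta(\|K(x) - K(x^*)\|_Y)\bigr) \]
because of the monotonicity of $\varphi$. This proves the estimate (4.5) with $\sigma = 0$
and $\tilde\varphi = 2\|w\|_X\, \varphi \circ \eta$. Note that the composition of two concave index functions
is again a concave index function.

If $\varphi(t) = t$ for all $t$ we use the estimate
\[ \|K(x) - K(x^*) - K'(x^*)(x - x^*)\|_Y \le \frac{\gamma}{2}\,\|x^* - x\|_X^2 \]
for all $x \in B(\hat x,\rho) \cap D(K)$ (see Lemma A.63 of Appendix A.7), thus
$\|K'(x^*)(x - x^*)\|_Y \le \|K(x) - K(x^*)\|_Y + \frac{\gamma}{2}\|x^* - x\|_X^2$ and thus from (4.6)
\[ 2\,\mathrm{Re}\,(x^* - \hat x,\, x^* - x)_X \le 2\|w\|_X\, \|K(x^*) - K(x)\|_Y + \|w\|_X\, \gamma\, \|x^* - x\|_X^2 \]
for all $x \in B(\hat x,\rho) \cap D(K)$ which proves the estimate (4.5) with $\sigma = \gamma\|w\|_X < 1$
and $\tilde\varphi(t) = 2\|w\|_X\, t$ for $t \ge 0$.

(c) Let Assumption 4.12 hold for $\varphi(t) = \beta t$ for all $t \ge 0$. For any fixed $z \in X$ and
any scalar $t$ (real or complex, according to the scalar field of $X$) sufficiently small
such that $x := x^* - tz \in B(\hat x,\rho) \cap D(K)$ we substitute
$x$ into (4.5) for $\varphi(t) = \beta t$ which yields
\[ \begin{aligned}
2\,\mathrm{Re}\,\bigl[t\,(x^* - \hat x,\, z)_X\bigr] &\le |t|^2 \sigma\,\|z\|_X^2 + \beta\,\|K(x^* - tz) - K(x^*)\|_Y \\
&\le |t|^2 \sigma\,\|z\|_X^2 + |t|\,\beta\,\|K'(x^*) z\|_Y + \frac{\beta\gamma}{2}\,|t|^2\,\|z\|_X^2 .
\end{aligned} \]
Dividing by $|t|$ and letting $t$ tend to zero yields² $|(x^* - \hat x, z)_X| \le \beta\,\|Az\|_Y$ for all
$z \in X$ where again $A = K'(x^*)$. We show that $x^* - \hat x$ belongs to the range of $A^*$.
First we note that $x^* - \hat x$ is orthogonal to the nullspace $\mathcal{N}(A)$ of $A$. We choose
a singular system $\{\mu_j, x_j, y_j : j \in J\}$ for $A$, see Appendix A.6, Theorem A.57,
where $J$ is finite or $J = \mathbb{N}$, and expand $x^* - \hat x$ in the form $x^* - \hat x = \sum_{j \in J} \gamma_j x_j$.
(The component in the nullspace $\mathcal{N}(A)$ vanishes because $x^* - \hat x$ is orthogonal
to $\mathcal{N}(A)$.) We set $J_n = J$ if $J$ is finite and $J_n = \{1,\ldots,n\}$ if $J = \mathbb{N}$. For
$z = \sum_{j \in J_n} \frac{\gamma_j}{\mu_j^2}\, x_j$ the inequality $|(x^* - \hat x, z)_X| \le \beta\,\|Az\|_Y$ is equivalent to
\[ \sum_{j \in J_n} \frac{|\gamma_j|^2}{\mu_j^2} \le \beta \Bigl( \sum_{j \in J_n} \frac{|\gamma_j|^2}{\mu_j^2} \Bigr)^{1/2}; \quad \text{that is,} \quad \sum_{j \in J_n} \frac{|\gamma_j|^2}{\mu_j^2} \le \beta^2 . \]
This proves that $w := \sum_{j \in J} \frac{\gamma_j}{\mu_j}\, y_j \in Y$ is well-defined. Finally, $A^* w = x^* - \hat x$.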


Under Assumptions 4.9 and 4.12, we are able to prove convergence and
also rates of convergence as in the linear theory. As we know from the linear
theory there are (at least) two strategies to choose the regularization parameter
$\alpha$. To achieve the rate $O(\sqrt{\delta})$, we should choose $\alpha = \alpha(\delta)$ to be proportional
to $\delta$ (a priori choice) or such that the "discrepancy" $\|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y$ is
proportional to $\delta$ (a posteriori choice). For nonlinear operators $K$, essentially
the same arguments as for linear operators (substitute $x = \hat x$ and $x = x^*$ into
(4.4)) show that
\[ \limsup_{\alpha \to 0} \|K(x^{\alpha,\delta}) - y^\delta\|_Y \le \delta \quad \text{and} \quad \lim_{\alpha \to \infty} \|K(x^{\alpha,\delta}) - y^\delta\|_Y = \|K(\hat x) - y^\delta\|_Y \]
for any choice of minimizers $x^{\alpha,\delta}$. However, for nonlinear operators, the mapping
$\alpha \mapsto \|K(x^{\alpha,\delta}) - y^\delta\|_Y$ is not necessarily continuous (see, e.g., [219] for a
discussion of this topic). Therefore, the discrepancy principle is not well-defined
unless more restrictive assumptions are made. In the following, we just take the
possibility to choose the regularization parameter by the discrepancy principle
as an assumption (see also [245], Section 4.1.2).


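Theorem 4.14 Let Assumptions 4.9 and 4.12 hold and let $\rho > 2\|x^* - \hat x\|_X$.

(a) Let $\alpha = \alpha(\delta)$ be chosen such that
\[ c_- \frac{\delta^2}{\varphi(\delta)} \le \alpha(\delta) \le c_+ \frac{\delta^2}{\varphi(\delta)} \quad \text{for all } \delta > 0 \]
where $c_+ \ge c_- > 0$ are independent of $\delta$ (a priori choice),

(b) or assume that there exist $r_+ > r_- \ge 1$ and $\alpha(\delta) > 0$ such that
\[ r_- \delta \le \|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y \le r_+ \delta \quad \text{for all } \delta > 0 \]
(a posteriori choice) where $x^{\alpha(\delta),\delta}$ denotes a minimum of the Tikhonov
functional $J_{\alpha(\delta),\delta}$ on $D(K)$.

Then $x^{\alpha(\delta),\delta} \in B(\hat x,\rho)$ for sufficiently small $\delta$ and
\[ \|x^{\alpha(\delta),\delta} - x^*\|_X^2 = O\bigl(\varphi(\delta)\bigr) \quad \text{and} \quad \|K(x^{\alpha(\delta),\delta}) - y^*\|_Y = O(\delta), \quad \delta \to 0. \]

² Note that the phase $t/|t|$ can be chosen arbitrarily!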


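Proof: We show first that $x^{\alpha(\delta),\delta} \in B(\hat x,\rho)$ for sufficiently small $\delta$. From
(4.4) for $x = x^*$, we conclude that
\[ \begin{aligned}
\|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 + \alpha\,\|x^{\alpha,\delta} - \hat x\|_X^2 &\le \delta^2 + \alpha\,\|x^* - \hat x\|_X^2 && \text{(4.7a)} \\
&\le \delta^2 + \alpha\,\frac{\rho^2}{4}. && \text{(4.7b)}
\end{aligned} \]
If $c_- \frac{\delta^2}{\varphi(\delta)} \le \alpha(\delta) \le c_+ \frac{\delta^2}{\varphi(\delta)}$ we conclude that
$\|x^{\alpha(\delta),\delta} - \hat x\|_X^2 \le \frac{\delta^2}{\alpha(\delta)} + \frac{\rho^2}{4} \le \frac{\varphi(\delta)}{c_-} + \frac{\rho^2}{4} < \rho^2$ for sufficiently small $\delta$.
If the discrepancy principle holds, then from (4.7b),
\[ r_-^2\,\delta^2 + \alpha(\delta)\,\|x^{\alpha(\delta),\delta} - \hat x\|_X^2 \le \delta^2 + \alpha(\delta)\,\frac{\rho^2}{4}, \]
and thus $\alpha(\delta)\,\|x^{\alpha(\delta),\delta} - \hat x\|_X^2 \le \alpha(\delta)\,\frac{\rho^2}{4}$ because $r_- \ge 1$. This shows
$\|x^{\alpha(\delta),\delta} - \hat x\|_X < \rho$ and ends the first part of the proof.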

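To show the error estimates, we use the variational source condition of
Assumption 4.12 and (4.7a):
\[ \begin{aligned}
\|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 + \alpha\beta\,\|x^{\alpha,\delta} - x^*\|_X^2
&\le \|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 + \alpha\,\|x^{\alpha,\delta} - \hat x\|_X^2 - \alpha\,\|x^* - \hat x\|_X^2 \\
&\qquad + \alpha\,\varphi\bigl(\|K(x^*) - K(x^{\alpha,\delta})\|_Y\bigr) \\
&\le \delta^2 + \alpha\,\varphi\bigl(\|K(x^*) - K(x^{\alpha,\delta})\|_Y\bigr) \\
&\le \delta^2 + \alpha\,\varphi\bigl(\|y^\delta - K(x^{\alpha,\delta})\|_Y + \delta\bigr).
\end{aligned} \tag{4.8} \]
Let first $\alpha(\delta)$ be chosen according to the discrepancy principle. Then
\[ (r_-^2 - 1)\,\delta^2 + \alpha(\delta)\,\beta\,\|x^{\alpha(\delta),\delta} - x^*\|_X^2 \le \alpha(\delta)\,\varphi\bigl((r_+ + 1)\,\delta\bigr) \le \alpha(\delta)\,(1 + r_+)\,\varphi(\delta) \]
where we have used that $\varphi(s\delta) \le s\,\varphi(\delta)$ for all $s \ge 1$ (see Lemma A.73). This
proves the assertion for $\|x^{\alpha(\delta),\delta} - x^*\|_X$ after division by $\alpha(\delta)$ and dropping
the first term on the left hand side. The estimate for $\|K(x^{\alpha(\delta),\delta}) - y^*\|_Y$
follows from the triangle inequality because $\|K(x^{\alpha(\delta),\delta}) - y^*\|_Y \le
\|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y + \delta \le (r_+ + 1)\,\delta$.

Let now $c_- \frac{\delta^2}{\varphi(\delta)} \le \alpha(\delta) \le c_+ \frac{\delta^2}{\varphi(\delta)}$. Substituting this into (4.8) and dropping
the second term on the left hand side yields
\[ \|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y^2 \le \delta^2 + \frac{c_+ \delta^2}{\varphi(\delta)}\, \varphi\bigl(\|y^\delta - K(x^{\alpha(\delta),\delta})\|_Y + \delta\bigr). \]
Now we set $t = \|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y / \delta$ for abbreviation. Then the previous
estimate reads as
\[ t^2 \le 1 + \frac{c_+}{\varphi(\delta)}\, \varphi\bigl((1+t)\,\delta\bigr) \le 1 + c_+ (1 + t) = 1 + c_+ + c_+ t, \]
where we used again $\varphi((1+t)\delta) \le (1+t)\,\varphi(\delta)$. Completing the square yields
$t \le \frac{c_+}{2} + \sqrt{1 + c_+ + \frac{c_+^2}{4}}$; that is, $\|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y \le c\,\delta$ for some $c > 0$,
and therefore $\|K(x^{\alpha(\delta),\delta}) - y^*\|_Y \le (c+1)\,\delta$ by the triangle inequality. Now
we substitute this and the bounds of $\alpha(\delta)$ into (4.8) again and drop the first
term on the left hand side which yields
\[ c_-\, \frac{\delta^2}{\varphi(\delta)}\, \beta\, \|x^{\alpha(\delta),\delta} - x^*\|_X^2 \le \delta^2 + c_+\, \frac{\delta^2}{\varphi(\delta)}\, \varphi\bigl((1+c)\,\delta\bigr) \le \delta^2 + c_+ (1 + c)\,\delta^2 \]
which yields $c_-\, \beta\, \|x^{\alpha(\delta),\delta} - x^*\|_X^2 \le [1 + c_+(1+c)]\,\varphi(\delta)$ and ends the proof of
the theorem.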


By Theorem 4.13, the special case $\varphi(t) = t$ corresponds to the source condition
$x^* - \hat x \in \mathcal{R}\bigl((A^*A)^{1/2}\bigr) = \mathcal{R}(A^*)$ and leads to the order $O(\sqrt{\delta})$ just as in the
linear case. The cases $\varphi(t) = t^\sigma$ for $\sigma > 1$ are not covered by the previous
theorem because these index functions are not concave anymore.

For proving the analogue of Theorem 2.12 to get the optimal order of convergence
up to $O(\delta^{2/3})$, we have to use the classical source condition of Assumption 4.11,
which is the obvious extension of the one in the linear case, see
Theorem 2.12. A variational source condition for this case is not available. We
follow the approach in [92] (see also [260] for the original proof).


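Theorem 4.15 Let Assumptions 4.9 and 4.11 hold with $\rho > 2\|x^* - \hat x\|_X$ and (ii)
modified in the way that $x^* - \hat x = \bigl(K'(x^*)^* K'(x^*)\bigr)^{\sigma/2} v \in \mathcal{R}\bigl((K'(x^*)^* K'(x^*))^{\sigma/2}\bigr)$
for some $\sigma \in [1,2]$ and $v \in X$ such that $\gamma\, \bigl\|\bigl(K'(x^*)^* K'(x^*)\bigr)^{(\sigma-1)/2} v\bigr\|_X < 1$
where $\gamma$ denotes the Lipschitz constant from part (i) of Assumption 4.11. We
choose $\alpha(\delta)$ such that
\[ c_-\, \delta^{2/(\sigma+1)} \le \alpha(\delta) \le c_+\, \delta^{2/(\sigma+1)} . \]
Then
\[ \begin{aligned}
\|x^{\alpha(\delta),\delta} - x^*\|_X &= O\bigl(\delta^{\sigma/(\sigma+1)}\bigr), \quad \delta \to 0, \\
\|K(x^{\alpha(\delta),\delta}) - y^*\|_Y &= O\bigl(\delta^{\sigma/(\sigma+1)}\bigr), \quad \delta \to 0.
\end{aligned} \]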
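Proof: We leave $\alpha > 0$ arbitrary until the end of the proof. We set $A := K'(x^*)$
and choose a singular system $\{\mu_j, x_j, y_j : j \in J\}$ for $A$, see Appendix A.6,
Theorem A.57, where $J$ is finite or $J = \mathbb{N}$. Then we write $x^* - \hat x$ as
\[ x^* - \hat x = (A^*A)^{\sigma/2} v = \sum_{j \in J} \mu_j^\sigma\, v_j\, x_j = A^* w \quad \text{with} \quad w = \sum_{j \in J} \mu_j^{\sigma-1}\, v_j\, y_j \]
where $v_j = (v, x_j)_X$ are the expansion coefficients of $v$. Then $\gamma\,\|w\|_Y < 1$. We
define $z_\alpha \in X$ by
\[ z_\alpha = x^* - \alpha\,(A^*A + \alpha I)^{-1} A^* w; \tag{4.9} \]
that is,
\[ z_\alpha = x^* - \sum_{j \in J} \frac{\alpha\, \mu_j^\sigma}{\mu_j^2 + \alpha}\, v_j\, x_j, \quad \text{thus} \quad \|z_\alpha - x^*\|_X^2 = \alpha^2 \sum_{j \in J} \Bigl( \frac{\mu_j^\sigma}{\mu_j^2 + \alpha} \Bigr)^2 |v_j|^2 . \]
Later, we will also need the form
\[ \begin{aligned}
A(z_\alpha - x^*) + \alpha w &= -\sum_{j \in J} \frac{\alpha\, \mu_j^{\sigma+1}}{\mu_j^2 + \alpha}\, v_j\, y_j + \alpha \sum_{j \in J} \mu_j^{\sigma-1}\, v_j\, y_j = \alpha^2 \sum_{j \in J} \frac{\mu_j^{\sigma-1}}{\mu_j^2 + \alpha}\, v_j\, y_j, \\
\|A(z_\alpha - x^*) + \alpha w\|_Y^2 &= \alpha^4 \sum_{j \in J} \Bigl( \frac{\mu_j^{\sigma-1}}{\mu_j^2 + \alpha} \Bigr)^2 |v_j|^2 .
\end{aligned} \]
With the elementary estimate (see Problem 4.3)
\[ \frac{\mu^t}{\mu^2 + \alpha} \le c_t\, \alpha^{t/2 - 1}, \quad \alpha, \mu > 0, \]
for $t = \sigma$ and $t = \sigma - 1$, respectively (where $c_t$ depends on $t$ only), we obtain
\[ \begin{aligned}
\|z_\alpha - x^*\|_X^2 &\le c_\sigma^2\, \|v\|_X^2\, \alpha^\sigma, && \text{(4.10a)} \\
\|A(z_\alpha - x^*) + \alpha w\|_Y^2 &\le c_{\sigma-1}^2\, \|v\|_X^2\, \alpha^{\sigma+1}. && \text{(4.10b)}
\end{aligned} \]
In particular, $z_\alpha$ converges to $x^*$ as $\alpha \to 0$ and is, therefore, in $B(\hat x,\rho) \cap D(K)$
for sufficiently small $\alpha$.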
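
The elementary estimate used for (4.10a) and (4.10b) follows from a scaling argument. The computation below is only a sketch; it shows that $c_t = 1$ is one admissible constant for $0 \le t \le 2$, which covers $t = \sigma \in [1,2]$ and $t = \sigma - 1 \in [0,1]$. It is not claimed to be the constant intended in Problem 4.3.

```latex
% Substitute \mu = \sqrt{\alpha}\, s with s > 0:
\frac{\mu^t}{\mu^2 + \alpha}
  = \frac{\alpha^{t/2}\, s^t}{\alpha\,(s^2 + 1)}
  = \alpha^{t/2 - 1}\, \frac{s^t}{1 + s^2} .

% For 0 \le t \le 2 we have s^t \le \max\{1, s^2\} \le 1 + s^2, hence
\frac{\mu^t}{\mu^2 + \alpha} \le \alpha^{t/2 - 1},
\qquad \alpha, \mu > 0 .

% Consequences for the sums above (with \sum_j |v_j|^2 \le \|v\|_X^2):
\alpha^2 \bigl(\alpha^{\sigma/2 - 1}\bigr)^2 = \alpha^{\sigma},
\qquad
\alpha^4 \bigl(\alpha^{(\sigma - 1)/2 - 1}\bigr)^2 = \alpha^{\sigma + 1} .
```

The two exponents in the last line are exactly the powers of $\alpha$ appearing in (4.10a) and (4.10b).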
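We use the optimality of $x^{\alpha,\delta}$; that is, (4.4) for $x = z_\alpha$, to obtain
\[ \|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 + \alpha\,\|x^{\alpha,\delta} - \hat x\|_X^2 \le \|K(z_\alpha) - y^\delta\|_Y^2 + \alpha\,\|z_\alpha - \hat x\|_X^2 . \]
With
\[ \begin{aligned}
\|x^{\alpha,\delta} - \hat x\|_X^2 &= \|x^* - \hat x\|_X^2 + 2\,\mathrm{Re}\,(x^{\alpha,\delta} - x^*,\, x^* - \hat x)_X + \|x^{\alpha,\delta} - x^*\|_X^2 \\
&= \|x^* - \hat x\|_X^2 + 2\,\mathrm{Re}\,\bigl(A(x^{\alpha,\delta} - x^*),\, w\bigr)_Y + \|x^{\alpha,\delta} - x^*\|_X^2, \\
\|z_\alpha - \hat x\|_X^2 &= \|x^* - \hat x\|_X^2 + 2\,\mathrm{Re}\,\bigl(A(z_\alpha - x^*),\, w\bigr)_Y + \|z_\alpha - x^*\|_X^2
\end{aligned} \]
we obtain
\[ \begin{aligned}
&\|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 + 2\alpha\,\mathrm{Re}\,\bigl(w,\, A(x^{\alpha,\delta} - x^*)\bigr)_Y + \alpha\,\|x^{\alpha,\delta} - x^*\|_X^2 \\
&\qquad \le \|K(z_\alpha) - y^\delta\|_Y^2 + 2\alpha\,\mathrm{Re}\,\bigl(w,\, A(z_\alpha - x^*)\bigr)_Y + \alpha\,\|z_\alpha - x^*\|_X^2,
\end{aligned} \]
and thus
\[ \begin{aligned}
&\|K(x^{\alpha,\delta}) - y^\delta + \alpha w\|_Y^2 + \alpha\,\|x^{\alpha,\delta} - x^*\|_X^2 \\
&\qquad \le \alpha^2 \|w\|_Y^2 + 2\alpha\,\mathrm{Re}\,\bigl(w,\, K(x^{\alpha,\delta}) - y^\delta - A(x^{\alpha,\delta} - x^*)\bigr)_Y \\
&\qquad\quad + \|K(z_\alpha) - y^\delta\|_Y^2 + 2\alpha\,\mathrm{Re}\,\bigl(w,\, A(z_\alpha - x^*)\bigr)_Y + \alpha\,\|z_\alpha - x^*\|_X^2 .
\end{aligned} \]
Now we use
\[ \begin{aligned}
K(x^{\alpha,\delta}) &= K(x^*) + A(x^{\alpha,\delta} - x^*) + r^{\alpha,\delta} = y^* + A(x^{\alpha,\delta} - x^*) + r^{\alpha,\delta} \quad \text{and} \\
K(z_\alpha) &= K(x^*) + A(z_\alpha - x^*) + s_\alpha = y^* + A(z_\alpha - x^*) + s_\alpha
\end{aligned} \]
with $\|r^{\alpha,\delta}\|_Y \le \frac{\gamma}{2}\|x^{\alpha,\delta} - x^*\|_X^2$ and $\|s_\alpha\|_Y \le \frac{\gamma}{2}\|z_\alpha - x^*\|_X^2$ and obtain
\[ \begin{aligned}
&\|K(x^{\alpha,\delta}) - y^\delta + \alpha w\|_Y^2 + \alpha\,\|x^{\alpha,\delta} - x^*\|_X^2 \\
&\quad \le \alpha^2 \|w\|_Y^2 + 2\alpha\,\mathrm{Re}\,(w,\, y^* - y^\delta)_Y + 2\alpha\,\mathrm{Re}\,(w,\, r^{\alpha,\delta})_Y \\
&\qquad + \|y^* - y^\delta + A(z_\alpha - x^*) + s_\alpha\|_Y^2 + 2\alpha\,\mathrm{Re}\,\bigl(w,\, A(z_\alpha - x^*)\bigr)_Y + \alpha\,\|z_\alpha - x^*\|_X^2 \\
&\quad = \alpha^2 \|w\|_Y^2 + 2\alpha\,\mathrm{Re}\,(w,\, y^* - y^\delta)_Y + 2\alpha\,\mathrm{Re}\,(w,\, r^{\alpha,\delta})_Y \\
&\qquad + \|y^* - y^\delta\|_Y^2 + 2\,\mathrm{Re}\,\bigl(A(z_\alpha - x^*) + s_\alpha,\, y^* - y^\delta\bigr)_Y \\
&\qquad + \|A(z_\alpha - x^*) + s_\alpha\|_Y^2 + 2\alpha\,\mathrm{Re}\,\bigl(w,\, A(z_\alpha - x^*)\bigr)_Y + \alpha\,\|z_\alpha - x^*\|_X^2 \\
&\quad = \|y^* - y^\delta\|_Y^2 + 2\,\mathrm{Re}\,\bigl(A(z_\alpha - x^*) + \alpha w,\, y^* - y^\delta\bigr)_Y + 2\,\mathrm{Re}\,(s_\alpha,\, y^* - y^\delta)_Y \\
&\qquad + 2\alpha\,\mathrm{Re}\,(w,\, r^{\alpha,\delta})_Y + \|A(z_\alpha - x^*) + \alpha w + s_\alpha\|_Y^2 \\
&\qquad - 2\alpha\,\mathrm{Re}\,(w,\, s_\alpha)_Y + \alpha\,\|z_\alpha - x^*\|_X^2 \\
&\quad \le \delta^2 + 2\delta\,\|A(z_\alpha - x^*) + \alpha w\|_Y + \gamma\delta\,\|z_\alpha - x^*\|_X^2 \\
&\qquad + \alpha\gamma\,\|w\|_Y \|x^{\alpha,\delta} - x^*\|_X^2 + 2\,\|A(z_\alpha - x^*) + \alpha w\|_Y^2 \\
&\qquad + \frac{\gamma^2}{2}\,\|z_\alpha - x^*\|_X^4 + \alpha\gamma\,\|w\|_Y \|z_\alpha - x^*\|_X^2 + \alpha\,\|z_\alpha - x^*\|_X^2 .
\end{aligned} \]
Now we use that $\gamma\|w\|_Y < 1$ and thus
\[ \begin{aligned}
&\|K(x^{\alpha,\delta}) - y^\delta + \alpha w\|_Y^2 + \alpha\,(1 - \gamma\|w\|_Y)\,\|x^{\alpha,\delta} - x^*\|_X^2 \\
&\quad \le \delta^2 + 2\delta\,\|A(z_\alpha - x^*) + \alpha w\|_Y + 2\,\|A(z_\alpha - x^*) + \alpha w\|_Y^2 \\
&\qquad + \frac{\gamma^2}{2}\,\|z_\alpha - x^*\|_X^4 + \bigl(\gamma\delta + \alpha\gamma\|w\|_Y + \alpha\bigr)\,\|z_\alpha - x^*\|_X^2 .
\end{aligned} \]
So far, we have not used the definition of $z_\alpha$. We substitute the estimates
(4.10a), (4.10b) and arrive at
\[ \|K(x^{\alpha,\delta}) - y^\delta + \alpha w\|_Y^2 + \alpha\,(1 - \gamma\|w\|_Y)\,\|x^{\alpha,\delta} - x^*\|_X^2 \le c\,\bigl[\delta^2 + \delta\,\alpha^{(\sigma+1)/2} + \alpha^{\sigma+1} + \alpha^{2\sigma} + \delta\,\alpha^\sigma\bigr] \]
for some $c > 0$. Dropping one of the terms on the left hand side yields
\[ \begin{aligned}
\|K(x^{\alpha,\delta}) - y^\delta\|_Y^2 &\le 2\,\|K(x^{\alpha,\delta}) - y^\delta + \alpha w\|_Y^2 + 2\alpha^2 \|w\|_Y^2 \\
&\le c\,\bigl[\delta^2 + \delta\,\alpha^{(\sigma+1)/2} + \alpha^{\sigma+1} + \alpha^{2\sigma} + \delta\,\alpha^\sigma + \alpha^2\bigr], \\
\|x^{\alpha,\delta} - x^*\|_X^2 &\le c\,\bigl[\delta^2/\alpha + \delta\,\alpha^{(\sigma-1)/2} + \alpha^\sigma + \alpha^{2\sigma-1} + \delta\,\alpha^{\sigma-1}\bigr]
\end{aligned} \]
for some $c > 0$. The choice $c_-\,\delta^{2/(\sigma+1)} \le \alpha(\delta) \le c_+\,\delta^{2/(\sigma+1)}$, together with the
triangle inequality $\|K(x^{\alpha,\delta}) - y^*\|_Y \le \|K(x^{\alpha,\delta}) - y^\delta\|_Y + \delta$, yields the desired
result (note that $1 \le \sigma \le 2$).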


We note that the modified defect $\|K(x^{\alpha(\delta),\delta}) - y^\delta + \alpha(\delta)\,w\|_Y$ satisfies
$\|K(x^{\alpha(\delta),\delta}) - y^\delta + \alpha(\delta)\,w\|_Y \le c\,\delta$ (see Problem 4.4).


In the next subsection, we apply the result to the Tikhonov regularization of a 
parameter identification problem. 


4.2.3. A Parameter-Identification Problem 


Let $f \in L^2(0,1)$ be given. It is the aim to determine the parameter function
$c \in L^2(0,1)$, $c \ge 0$ on $(0,1)$, in the boundary value problem
\[ -u''(t) + c(t)\,u(t) = f(t), \quad 0 < t < 1, \quad u(0) = u(1) = 0, \tag{4.11} \]
from perturbed data $u^\delta(t)$. We recall the Sobolev spaces $H^p(0,1)$ from (1.24)
as the spaces
\[ H^p(0,1) = \Bigl\{ u \in C^{p-1}[0,1] : u^{(p-1)}(t) = a + \int_0^t \psi(s)\,ds, \ a \in \mathbb{R}, \ \psi \in L^2(0,1) \Bigr\} \]
and set $u^{(p)} := \psi$ for the $p$th derivative. Note that $H^p(0,1) \subset C[0,1]$ for
$p \ge 1$ by definition. Then the differential equation of (4.11) for $u \in H^2(0,1)$ is
understood in the $L^2$-sense. It is an easy exercise (see Problem 4.7) to show
that $\|u\|_\infty \le \|u'\|_{L^2(0,1)}$ for all $u \in H^1(0,1)$ with $u(0) = 0$. First, we consider
the direct problem and show that the boundary value problem is equivalent to
an integral equation of the second kind.

Lemma 4.16 Let $f \in L^2(0,1)$ and $c \in L^2_+(0,1) := \{c \in L^2(0,1) : c \ge
0 \text{ almost everywhere on } (0,1)\}$.


(a) If $u \in H^2(0,1)$ solves (4.11) then $u$ solves the integral equation
\[ u(t) + \int_0^1 g(t,s)\,c(s)\,u(s)\,ds = \int_0^1 g(t,s)\,f(s)\,ds, \quad t \in [0,1], \tag{4.12} \]
where
\[ g(t,s) = \begin{cases} s\,(1-t), & 0 \le s \le t \le 1, \\ t\,(1-s), & 0 \le t \le s \le 1. \end{cases} \]

(b) We define the integral operator $G : L^2(0,1) \to L^2(0,1)$ by
\[ (Gv)(t) = \int_0^1 g(t,s)\,v(s)\,ds = (1-t) \int_0^t s\,v(s)\,ds + t \int_t^1 (1-s)\,v(s)\,ds, \]
$t \in (0,1)$, $v \in L^2(0,1)$. The operator $G$ is bounded from $L^2(0,1)$ into
$H^2(0,1)$ and $(Gv)'' = -v$.

(c) If $u \in C[0,1]$ is a solution of (4.12); that is, of $u + G(cu) = Gf$, then
$u \in H^2(0,1)$ and $u$ is a solution of (4.11). Note that the right-hand side
of (4.12) is continuous because $Gf \in H^2(0,1)$.
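
The identity $(Gv)'' = -v$ from part (b) can be checked numerically on a case with known solution. The following is a minimal sketch under assumptions of our own: the density is $v(s) = \pi^2 \sin(\pi s)$, for which the solution of $-u'' = v$, $u(0) = u(1) = 0$, is $u(t) = \sin(\pi t)$; the integral is approximated by a midpoint rule; the names `g`, `apply_G`, `v`, and `max_error` are hypothetical.

```python
import math

def g(t, s):
    """Kernel from (4.12): s(1 - t) for s <= t and t(1 - s) for t <= s."""
    return s * (1.0 - t) if s <= t else t * (1.0 - s)

def apply_G(v, t, m=2000):
    """Midpoint-rule approximation of (Gv)(t) = int_0^1 g(t, s) v(s) ds."""
    h = 1.0 / m
    return h * sum(g(t, (i + 0.5) * h) * v((i + 0.5) * h) for i in range(m))

def v(s):
    """Assumed test density: v = pi^2 sin(pi s), so that Gv = sin(pi t)."""
    return math.pi ** 2 * math.sin(math.pi * s)

def max_error(points=(0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0)):
    """Largest deviation of Gv from the exact solution sin(pi t)."""
    return max(abs(apply_G(v, t) - math.sin(math.pi * t)) for t in points)

if __name__ == "__main__":
    print(max_error())
```

The deviation is of the size of the quadrature error only, and the computed values vanish at $t = 0$ and $t = 1$, reflecting the boundary conditions built into the kernel.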


Proof: (a) Let $u \in H^2(0,1)$ solve (4.11) and set $h = f - cu$. Then $h \in
L^2(0,1)$ (because $u$ is continuous) and $-u'' = h$. Integrating this equation
twice and using the boundary conditions $u(0) = u(1) = 0$ yields the assertion
(see Problem 4.5).

(b) Let $v \in L^2(0,1)$ and set
\[ \begin{aligned}
u(t) &= (Gv)(t) = (1-t) \int_0^t s\,v(s)\,ds + t \int_t^1 (1-s)\,v(s)\,ds, \quad t \in [0,1], \\
\psi(t) &= -\int_0^t v(s)\,ds + \int_0^1 (1-s)\,v(s)\,ds, \quad t \in (0,1).
\end{aligned} \]
Then it is easy to see that $u(t) = \int_0^t \psi(s)\,ds$. Therefore, $u \in H^1(0,1)$ and
$u' = \psi$. From the definitions of $\psi$ and $H^2(0,1)$, we observe that $u \in H^2(0,1)$
and $v = -u''$.

(c) Let now $u \in C[0,1]$ be a solution of (4.12) and set again $h = f - cu$. Then
again $h \in L^2(0,1)$, and $u$ has the representation $u = Gh$. By part (b), we
conclude that $u \in H^2(0,1)$ and $-u'' = h = f - cu$.


Theorem 4.17 The integral equation (4.12) and the boundary value problem
(4.11) are uniquely solvable for all $f \in L^2(0,1)$ and $c \in L^2_+(0,1)$. Furthermore,
there exists $\gamma > 0$ (independent of $f$ and $c$) such that
$\|u\|_{H^2(0,1)} \le \gamma\,\bigl(1 + \|c\|_{L^2(0,1)}\bigr)\,\|f\|_{L^2(0,1)}$.
Proof: By the previous lemma we have to study the integral equation
\[ u + G(cu) = Gf \tag{4.13} \]
with the integral operator $G$ with kernel $g$ from the previous lemma. The
operator $T : u \mapsto G(cu)$ is bounded from $C[0,1]$ into $H^2(0,1)$ and thus compact
from $C[0,1]$ into itself (see again Problem 4.5). Now we use the following result
from linear functional analysis (see Theorem A.36 of Appendix A.3): If the
homogeneous linear equation $u + Tu = 0$ with the compact operator $T$ from
$C[0,1]$ into itself admits only the trivial solution $u = 0$ then the inhomogeneous
equation $u + Tu = h$ is uniquely solvable for all $h \in C[0,1]$, and the solution
depends continuously on $h$. In other words, if $I + T$ is one-to-one then it is also onto
and $I + T$ is boundedly invertible. Therefore, we have to show injectivity of
$I + T$ in $C[0,1]$. Let $u \in C[0,1]$ solve (4.13) with right-hand side $h = 0$. Then
$u \in H^2(0,1)$ and $u$ solves (4.11) for $f = 0$ by the previous lemma. Multiplication
of (4.11) by $u(t)$ and integration yields
\[ 0 = \int_0^1 \bigl[-u''(t) + c(t)\,u(t)\bigr]\,u(t)\,dt = \int_0^1 \bigl[u'(t)^2 + c(t)\,u(t)^2\bigr]\,dt, \]
where we used partial integration and the fact that $u$ vanishes at the boundary
of $[0,1]$. Since $c \ge 0$ we conclude that $u'$ vanishes on $[0,1]$. Therefore, $u$ is
constant and thus zero because of the boundary conditions. Therefore, $I + T$
is one-to-one and thus invertible. This shows that (4.11) is uniquely solvable
in $H^2(0,1)$ for every $f, c \in L^2(0,1)$ with $c \ge 0$ almost everywhere on $(0,1)$.
In order to derive the explicit estimate for $\|u\|_{H^2(0,1)}$ we observe first that
any solution $u \in H^2(0,1)$ of (4.11) satisfies $\|u'\|_{L^2(0,1)}^2 \le \|f\|_{L^2(0,1)}\,\|u\|_{L^2(0,1)}$.
Indeed, this follows by multiplication of the differential equation by $u(t)$ and
integration:
\[ -\int_0^1 u''(t)\,u(t)\,dt + \int_0^1 c(t)\,u(t)^2\,dt = \int_0^1 f(t)\,u(t)\,dt \le \|f\|_{L^2(0,1)}\,\|u\|_{L^2(0,1)} . \]
Partial integration and the assumption $c(t) \ge 0$ yield the estimate $\|u'\|_{L^2(0,1)}^2 \le
\|f\|_{L^2(0,1)}\,\|u\|_{L^2(0,1)}$. With $\|u\|_\infty \le \|u'\|_{L^2(0,1)}$ and $\|u\|_{L^2(0,1)} \le \|u\|_\infty$ this
implies that $\|u\|_\infty \le \|f\|_{L^2(0,1)}$. Therefore
\[ \begin{aligned}
\|u\|_{H^2(0,1)} = \|G(f - cu)\|_{H^2(0,1)} &\le \|G\|_{\mathcal{L}(L^2(0,1),H^2(0,1))}\,\|f - cu\|_{L^2(0,1)} \\
&\le \|G\|_{\mathcal{L}(L^2(0,1),H^2(0,1))}\,\bigl[\|f\|_{L^2(0,1)} + \|c\|_{L^2(0,1)}\,\|u\|_\infty\bigr] \\
&\le \bigl(1 + \|c\|_{L^2(0,1)}\bigr)\,\|G\|_{\mathcal{L}(L^2(0,1),H^2(0,1))}\,\|f\|_{L^2(0,1)} .
\end{aligned} \]
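
The intermediate bound $\|u\|_\infty \le \|f\|_{L^2(0,1)}$ can be observed on a discretized version of (4.11). The sketch below is an illustration only and rests on assumptions that are not part of the text: the boundary value problem is replaced by a central-difference scheme on a uniform grid, the tridiagonal system is solved by the Thomas algorithm, and the names `solve_bvp` and `check_stability` as well as the sample coefficients are hypothetical.

```python
import math

def solve_bvp(c, f, n):
    """Central-difference solution of -u'' + c u = f, u(0) = u(1) = 0.

    c and f are lists of values at the interior nodes t_i = i/n, i = 1..n-1.
    The tridiagonal system is solved with the Thomas algorithm.
    """
    h = 1.0 / n
    off = -1.0 / h ** 2                      # sub- and super-diagonal entry
    diag = [2.0 / h ** 2 + ci for ci in c]   # main diagonal
    rhs = list(f)
    for i in range(1, n - 1):                # forward elimination
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * (n - 1)
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 3, -1, -1):           # back substitution
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return u

def check_stability(c_fun, f_fun, n=400):
    """Return (max |u|, discrete L^2 norm of f) for coefficient c and data f."""
    t = [i / n for i in range(1, n)]
    u = solve_bvp([c_fun(ti) for ti in t], [f_fun(ti) for ti in t], n)
    sup_u = max(abs(ui) for ui in u)
    l2_f = math.sqrt(sum(f_fun(ti) ** 2 for ti in t) / n)
    return sup_u, l2_f

if __name__ == "__main__":
    print(check_stability(lambda t: 0.0, lambda t: 1.0))
    print(check_stability(lambda t: 50.0 * t, lambda t: math.sin(7.0 * t)))
```

For $c = 0$ and $f = 1$ the exact solution is $u(t) = t(1-t)/2$ with maximum $1/8$, which the scheme reproduces; in both sample cases the maximum of $|u|$ stays well below the norm of $f$.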


We can even show existence and uniqueness for $c$ from a small open neighborhood
of $L^2_+(0,1) = \{c \in L^2(0,1) : c \ge 0 \text{ on } (0,1)\}$.

Corollary 4.18 There exists $\delta > 0$ such that the boundary value problem (4.11)
is uniquely solvable for all $f \in L^2(0,1)$ and $c \in U_\delta$ where
\[ U_\delta = \bigl\{ c = c_1 + h \in L^2(0,1) : c_1 \in L^2_+(0,1), \ h \in L^2(0,1), \ \bigl(1 + \|c_1\|_{L^2(0,1)}\bigr)\,\|h\|_{L^2(0,1)} < \delta \bigr\}. \]
Furthermore, there exists $\gamma > 0$ such that $\|u\|_{H^2(0,1)} \le \gamma\,\bigl(1 + \|c\|_{L^2(0,1)}\bigr)\,\|f\|_{L^2(0,1)}$
for all $f \in L^2(0,1)$ and $c \in U_\delta$.


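Proof: Let $K_{c_1} : L^2(0,1) \to H^2(0,1)$ be the operator $f \mapsto u$ where $u$ is the
solution of (4.11) for $c_1 \in L^2_+(0,1)$. Let $c = c_1 + h \in U_\delta$. We consider the fixed
point equation $\tilde u + K_{c_1}(h\tilde u) = K_{c_1} f$ for $\tilde u \in H^2(0,1)$. We have for $v \in H^2(0,1)$
that
\[ \begin{aligned}
\|K_{c_1}(hv)\|_{H^2(0,1)} &\le \gamma\,\bigl(1 + \|c_1\|_{L^2(0,1)}\bigr)\,\|hv\|_{L^2(0,1)} \\
&\le \gamma\,\bigl(1 + \|c_1\|_{L^2(0,1)}\bigr)\,\|h\|_{L^2(0,1)}\,\|v\|_\infty \le \gamma\,\delta\,\|v\|_{H^2(0,1)} .
\end{aligned} \]
For $\delta < 1/\gamma$ we observe that $v \mapsto K_{c_1}(hv)$ is a contraction and, by the
Contraction Theorem A.31, $\tilde u + K_{c_1}(h\tilde u) = K_{c_1} f$ has a unique solution $\tilde u \in H^2(0,1)$
and
\[ \begin{aligned}
\|\tilde u\|_{H^2(0,1)} &\le \frac{1}{1 - \delta\gamma}\,\|K_{c_1} f\|_{H^2(0,1)} \le \gamma\,\frac{1 + \|c_1\|_{L^2(0,1)}}{1 - \delta\gamma}\,\|f\|_{L^2(0,1)} \\
&\le \gamma\,\frac{1 + \|c_1 + h\|_{L^2(0,1)} + \delta}{1 - \delta\gamma}\,\|f\|_{L^2(0,1)} \\
&\le \gamma\,\frac{1 + \delta}{1 - \delta\gamma}\,\bigl(1 + \|c\|_{L^2(0,1)}\bigr)\,\|f\|_{L^2(0,1)}
\end{aligned} \]
because $\|h\|_{L^2(0,1)} \le \delta$. Finally, we note that the equation $\tilde u = K_{c_1} f - K_{c_1}(h\tilde u)$
is equivalent to $-\tilde u'' + (c_1 + h)\,\tilde u = f$.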


We note that the set $U_\delta$ is an open set containing $L^2_+(0,1)$ (see Problem 4.5).
Therefore, $K$ can be extended to the set $U_\delta$. As a next step towards the inverse
problem, we show that the nonlinear mapping $K : c \mapsto u$ is continuous and even
differentiable.


Theorem 4.19 Let Us > L4(0,1) be as in the previous corollary and let K : 
Us — H?(0,1) defined by K(c) = u where u € H?(0,1) solves the boundary 
value problem (4.11). Then K is continuous and even differentiable in every 
c € Us. The derivative is given by K'(c)h = v where v € H?(0,1) is the solution 
of the boundary value problem 


—v"(t) + c(t)v(t) = —A(thu(t), O<t<1, v(0)=v(1)=0, (4.14) 
and u € H?(0,1) is the solution of (4.11) for c; that is, u= K(c). 


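Proof: Let $h \in L^2(0,1)$ such that $c + h \in U_\delta$ and let $\tilde u = K(c+h)$. Then $\tilde u$ and $u$ satisfy

\[
-\tilde u'' + (c+h)\,\tilde u = f, \qquad -u'' + (c+h)\,u = f + hu, \qquad \tilde u(0) = \tilde u(1) = u(0) = u(1) = 0,
\]

respectively. We subtract both equations which yields $-(\tilde u - u)'' + (c+h)(\tilde u - u) = -hu$. The stability estimate yields

\[
\begin{aligned}
\|\tilde u - u\|_{H^2(0,1)} &\leq \gamma \, \bigl(1 + \|c + h\|_{L^2(0,1)}\bigr) \, \|hu\|_{L^2(0,1)} \\
 &\leq \gamma \, \bigl(1 + \|c\|_{L^2(0,1)} + \delta\bigr) \, \|h\|_{L^2(0,1)} \|u\|_\infty
\end{aligned}
\]

which proves continuity (even Lipschitz continuity on bounded sets for $c$).

For the differentiability, we just subtract the equations for $u$ and $v$ from the one for $\tilde u$:

\[
-(\tilde u - u - v)'' + (c+h)(\tilde u - u - v) = -hv, \tag{4.15}
\]

with homogeneous boundary conditions. The stability estimate yields

\[
\begin{aligned}
\|\tilde u - u - v\|_{H^2(0,1)} &\leq \gamma \, \bigl(1 + \|c + h\|_{L^2(0,1)}\bigr) \, \|hv\|_{L^2(0,1)} \\
 &\leq \gamma \, \bigl(1 + \|c\|_{L^2(0,1)} + \delta\bigr) \, \|h\|_{L^2(0,1)} \|v\|_\infty .
\end{aligned}
\]

The stability estimate applied to $v$ yields

\[
\begin{aligned}
\|v\|_\infty \leq \|v\|_{H^2(0,1)} &\leq \gamma \, \bigl(1 + \|c\|_{L^2(0,1)}\bigr) \, \|hu\|_{L^2(0,1)} \\
 &\leq \gamma \, \bigl(1 + \|c\|_{L^2(0,1)}\bigr) \, \|h\|_{L^2(0,1)} \|u\|_\infty
\end{aligned}
\]

which altogether ends up to

\[
\|\tilde u - u - v\|_{H^2(0,1)} \leq \gamma^2 \bigl(1 + \|c\|_{L^2(0,1)} + \delta\bigr)^2 \, \|h\|^2_{L^2(0,1)} \|u\|_\infty .
\]

This proves differentiability.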


In the following, we consider the parameter-to-solution map $K$ as a mapping from $U_\delta \subset L^2(0,1)$ into $L^2(0,1)$ instead of $H^2(0,1)$. Then, of course, $K$ is also differentiable with respect to this space with the same derivative.


From the theory, we know that for Assumption 4.11, we need the adjoint of 
K'(c). 

Lemma 4.20 The adjoint operator $K'(c)^* : L^2(0,1) \to L^2(0,1)$ is given by $K'(c)^* w = -u\,y$ where $u = K(c)$, and $y \in H^2(0,1)$ solves the following boundary value problem (the "adjoint problem"):

\[
-y''(t) + c(t)\,y(t) = w(t), \quad 0 < t < 1, \qquad y(0) = y(1) = 0. \tag{4.16}
\]

Proof: Let $h, w \in L^2(0,1)$, let $v$ be the solution of (4.14) for the given $c$, and let $y$ be the solution of (4.16). By partial integration we compute

\[
\begin{aligned}
\bigl(K'(c)h, w\bigr)_{L^2(0,1)} &= \int_0^1 v(t) \bigl[ -y''(t) + c(t)\,y(t) \bigr] \, dt \\
 &= \int_0^1 \bigl[ -v''(t) + c(t)\,v(t) \bigr] \, y(t) \, dt \ =\ -\int_0^1 h(t)\,u(t)\,y(t) \, dt \\
 &= -(h, u\,y)_{L^2(0,1)} .
\end{aligned}
\]

This proves the assertion.
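The derivative formula (4.14) and the adjoint formula of Lemma 4.20 can be checked numerically. The following is only a minimal sketch under assumptions that are not part of the text: the boundary value problem is discretized by central differences on a uniform grid, and all names (`solve_bvp`, `inner`, `h_dir`, the choice of `c`, `f`, `w`) are hypothetical. Since the discrete operator is symmetric, the adjoint identity holds up to rounding, and the linearization error should decay quadratically in the size of the perturbation.

```python
import math

def solve_bvp(c, f, n):
    """Solve -u'' + c u = f, u(0) = u(1) = 0, by central differences
    on n interior grid points (Thomas algorithm for the tridiagonal system)."""
    h = 1.0 / (n + 1)
    off = -1.0 / h ** 2                      # constant off-diagonal entry
    diag = [2.0 / h ** 2 + ci for ci in c]   # main diagonal
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = off / diag[0], f[0] / diag[0]
    for i in range(1, n):                    # forward elimination
        m = diag[i] - off * cp[i - 1]
        cp[i] = off / m
        dp[i] = (f[i] - off * dp[i - 1]) / m
    u = [0.0] * n
    u[-1] = dp[-1]
    for i in range(n - 2, -1, -1):           # back substitution
        u[i] = dp[i] - cp[i] * u[i + 1]
    return u

def inner(a, b, h):
    """Discrete L^2(0,1) inner product."""
    return h * sum(x * y for x, y in zip(a, b))

n = 199
h = 1.0 / (n + 1)
t = [(i + 1) * h for i in range(n)]
c = [1.0 + ti for ti in t]                   # nonnegative coefficient (assumed example)
f = [1.0] * n
h_dir = [math.sin(3.0 * ti) for ti in t]     # perturbation direction h
w = [math.cos(2.0 * ti) for ti in t]

u = solve_bvp(c, f, n)                                        # u = K(c)
v = solve_bvp(c, [-hd * ui for hd, ui in zip(h_dir, u)], n)   # v = K'(c)h, cf. (4.14)
y = solve_bvp(c, w, n)                                        # adjoint problem (4.16)

lhs = inner(v, w, h)                                          # (K'(c)h, w)
rhs = inner(h_dir, [-ui * yi for ui, yi in zip(u, y)], h)     # (h, -u y)

def remainder(eps):
    """Discrete L^2 norm of K(c + eps h) - K(c) - eps K'(c)h."""
    u_eps = solve_bvp([ci + eps * hd for ci, hd in zip(c, h_dir)], f, n)
    d = [a - b - eps * vv for a, b, vv in zip(u_eps, u, v)]
    return math.sqrt(inner(d, d, h))

ratio = remainder(1e-1) / remainder(1e-2)    # close to 100 for a quadratic remainder
print(lhs, rhs, ratio)
```

A ratio near 100 when the perturbation shrinks by a factor of 10 is what the quadratic estimate in the proof of Theorem 4.19 suggests.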


Now we can formulate condition (ii) of Assumption 4.11 for a linear index function $\varphi$:

The existence of $w \in L^2(0,1)$ with $c^* - \hat c = K'(c^*)^* w$ is equivalent to the existence of $y \in H^2(0,1)$ with $y(0) = y(1) = 0$ and $c^* - \hat c = -u^* y$. Therefore, the condition is equivalent to

\[
\frac{c^* - \hat c}{u^*} \in H^2(0,1) \cap H^1_0(0,1) .
\]

This includes smoothness of $c^* - \hat c$ as well as a sufficiently strong boundary condition because also $u^*$ vanishes at the boundary of $[0,1]$.

4.2.4 A Glimpse on Extensions to Banach Spaces


Recalling the classical Tikhonov functional $J_{\alpha,\delta}$ in Hilbert spaces from (4.2) we observe that the first term measures the misfit in the equation while the second part serves as a penalty term. The error $x^{\alpha,\delta} - x^*$ in the solution is measured in a third metric. In many cases, the canonical space for the unknown quantity $x$ is only a Banach space rather than a Hilbert space. For example, in parameter identification problems as in the previous subsection the canonical spaces for the parameters are $L^\infty$-spaces rather than $L^2$-spaces. The Hilbert space setting in the previous subsection only works because the pointwise multiplication is continuous as a mapping from $L^2 \times H^2$ into $L^2$. For more general partial differential equations this is not always true. For an elaborate motivation why to use Banach space settings, we refer to Chapter I of the excellent monograph [245] by Schuster, Kaltenbacher, Hofmann, and Kazimierski.


In this subsection, we will get the flavor of some aspects of this theory. Let $X$ and $Y$ be Banach spaces, $K : X \supset D(K) \to Y$ a (nonlinear) operator with the domain of definition $D(K) \subset X$ where $D(K)$ is again convex and closed, and let $x^* \in D(K)$ be the exact solution of $K(x) = y^*$ for some $y^*$. In the following we fix this pair $x^*, y^*$. As before, $y^*$ is perturbed by $y^\delta \in Y$ such that $\|y^\delta - y^*\|_Y \leq \delta$ for all $\delta \in (0, \delta_{\max})$ for some $\delta_{\max} > 0$. We note that we measure the error in the data with respect to the Banach space norm. The penalty term $\|x - \hat x\|_X^2$ is now replaced by any convex and continuous function $\Omega : X \to [0,\infty]$ where $D(\Omega) := \{x \in X : \Omega(x) < \infty\}$ is not empty and, even more, $D(K) \cap D(\Omega) \neq \emptyset$. Further assumptions on $\Omega$ and $K$ are needed to ensure the existence of minima of the Tikhonov functional

\[
J_{\alpha,\delta}(x) := \|K(x) - y^\delta\|_Y^p + \alpha\,\Omega(x), \qquad x \in D(K) \cap D(\Omega). \tag{4.17}
\]


Here p > 1 is a fixed parameter. Instead of posing assumptions concerning 
the weak topology, we just make the same assumption as at the beginning of 
Subsection 4.2.2. 


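Assumption 4.21 The Tikhonov functional $J_{\alpha,\delta}$ possesses global minima $x^{\alpha,\delta} \in D(K) \cap D(\Omega)$ on $D(K) \cap D(\Omega)$ for all $\alpha, \delta > 0$; that is,

\[
\|K(x^{\alpha,\delta}) - y^\delta\|_Y^p + \alpha\,\Omega(x^{\alpha,\delta}) \leq \|K(x) - y^\delta\|_Y^p + \alpha\,\Omega(x) \tag{4.18}
\]

for all $x \in D(K) \cap D(\Omega)$.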


It remains to specify the metric in which we measure the error in $x$. As we will see in a moment the norm in $X$ is not always the best possibility. To have more flexibility, we take any "measure function" $E(x)$ which measures the distance of $x$ to $x^*$. We only require that $E(x) \geq 0$ for all $x \in X$ and $E(x^*) = 0$. Then the variational source condition of Assumption 4.12 is generalized into the following form.

Assumption 4.22 (Variational Source Condition) Let $x^* \in D(K) \cap D(\Omega)$ be a solution of $K(x) = y^*$. Furthermore, let $\delta_{\max} = \sup\{\|K(x^*) - K(x)\|_Y : x \in D(K) \cap D(\Omega)\}$ and $\varphi : [0, \delta_{\max}) \to \mathbb{R}$ be a concave index function (see Definition 4.10), and, for some constants $\beta > 0$ and $\rho > 0$ let the following estimate hold:

\[
\beta\,E(x) \leq \Omega(x) - \Omega(x^*) + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr)
\]

for all $x \in M_\rho := \{x \in D(K) \cap D(\Omega) : \Omega(x) \leq \Omega(x^*) + \rho\}$.
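Theorem 4.23 Let Assumptions 4.21 and 4.22 hold.

(a) Let $\alpha = \alpha(\delta)$ be chosen such that

\[
c_- \, \frac{\delta^p}{\varphi(\delta)} \leq \alpha(\delta) \leq c_+ \, \frac{\delta^p}{\varphi(\delta)} \quad \text{for all } \delta \in (0, \delta_1) \tag{4.19}
\]

where $c_+ \geq c_- > 0$ are independent of $\delta$ and where $\delta_1 \in (0, \delta_{\max})$ is chosen such that $\varphi(\delta_1) \leq \rho\,c_-$ (a priori choice),

(b) or assume that there exist $r_+ > r_- \geq 1$ and $\alpha(\delta) > 0$ such that

\[
r_- \,\delta \leq \|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y \leq r_+ \,\delta \quad \text{for all } \delta \in (0, \delta_1) \tag{4.20}
\]

where $\delta_1 \in (0, \delta_{\max})$ is arbitrary (a posteriori choice).

Then $x^{\alpha(\delta),\delta} \in M_\rho$ and

\[
E\bigl(x^{\alpha(\delta),\delta}\bigr) = \mathcal{O}\bigl(\varphi(\delta)\bigr) \quad \text{and} \quad \|K(x^{\alpha(\delta),\delta}) - y^*\|_Y = \mathcal{O}(\delta), \qquad \delta \to 0 .
\]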


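Proof: We follow almost exactly the proof of Theorem 4.14. Substituting $x = x^*$ into (4.18) yields

\[
\|K(x^{\alpha,\delta}) - y^\delta\|_Y^p + \alpha\,\Omega(x^{\alpha,\delta}) \leq \delta^p + \alpha\,\Omega(x^*). \tag{4.21}
\]

First we show that $x^{\alpha(\delta),\delta} \in M_\rho$. If $\alpha(\delta)$ is chosen as in (4.19) then

\[
\Omega\bigl(x^{\alpha(\delta),\delta}\bigr) \leq \frac{\delta^p}{\alpha(\delta)} + \Omega(x^*) \leq \frac{1}{c_-}\,\varphi(\delta) + \Omega(x^*) \leq \frac{1}{c_-}\,\varphi(\delta_1) + \Omega(x^*)
\]

which shows $x^{\alpha(\delta),\delta} \in M_\rho$ by the choice of $\delta_1$. If $\alpha(\delta)$ is chosen by the discrepancy principle (4.20) then again from (4.21)

\[
(r_-^p - 1)\,\delta^p + \alpha(\delta)\,\Omega\bigl(x^{\alpha(\delta),\delta}\bigr) \leq \alpha(\delta)\,\Omega(x^*)
\]

and thus $\Omega(x^{\alpha(\delta),\delta}) \leq \Omega(x^*)$ because $r_- \geq 1$. Therefore, $x^{\alpha(\delta),\delta} \in M_0 \subset M_\rho$.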


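Now we show the rates of convergence. Applying the variational source condition we conclude from (4.21) that

\[
\begin{aligned}
\|K(x^{\alpha,\delta}) - y^\delta\|_Y^p + \alpha\beta\,E(x^{\alpha,\delta}) &\leq \delta^p + \alpha\,\varphi\bigl(\|K(x^{\alpha,\delta}) - y^*\|_Y\bigr) \\
 &\leq \delta^p + \alpha\,\varphi\bigl(\|K(x^{\alpha,\delta}) - y^\delta\|_Y + \delta\bigr).
\end{aligned} \tag{4.22}
\]

If $\alpha = \alpha(\delta)$ is chosen according to the discrepancy principle (4.20) then

\[
(r_-^p - 1)\,\delta^p + \alpha(\delta)\,\beta\,E\bigl(x^{\alpha(\delta),\delta}\bigr) \leq \alpha(\delta)\,\varphi\bigl((r_+ + 1)\delta\bigr)
\]

and thus $\beta\,E(x^{\alpha(\delta),\delta}) \leq \varphi((r_+ + 1)\delta)$. Now we use the elementary estimate $\varphi(s\delta) \leq s\,\varphi(\delta)$ for all $s \geq 1$ and $\delta \geq 0$ (see Lemma A.73 of Appendix A.8). Therefore,

\[
\beta\,E\bigl(x^{\alpha(\delta),\delta}\bigr) \leq (r_+ + 1)\,\varphi(\delta).
\]

This proves the estimate for $E(x^{\alpha(\delta),\delta})$. The estimate for $\|K(x^{\alpha(\delta),\delta}) - y^*\|_Y$ follows obviously from the discrepancy inequality and the triangle inequality.

Let now $\alpha = \alpha(\delta)$ be given by (4.19). From (4.22), we obtain, using the upper estimate of $\alpha(\delta)$,

\[
\|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y^p \leq \delta^p + c_+ \, \frac{\delta^p}{\varphi(\delta)} \, \varphi\bigl(\|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y + \delta\bigr).
\]

We set $t = \|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y / \delta$ for abbreviation. Then the previous formula takes the form

\[
t^p \leq 1 + c_+ \, \frac{\varphi\bigl(\delta(t+1)\bigr)}{\varphi(\delta)} \leq 1 + c_+(t+1) = (1 + c_+) + c_+ t
\]

where we used the estimate $\varphi(s\delta) \leq s\,\varphi(\delta)$ for $s \geq 1$ again. Choose $c > 0$ with $c\,(c^{p-1} - c_+) > 1 + c_+$. Then $t \leq c$. Indeed, if $t > c$ then $t^p - c_+ t = t\,(t^{p-1} - c_+) > c\,(c^{p-1} - c_+) > 1 + c_+$, a contradiction. This proves that $\|K(x^{\alpha(\delta),\delta}) - y^\delta\|_Y \leq c\,\delta$. Now we substitute this into the right-hand side of (4.22), which yields

\[
\beta\,E\bigl(x^{\alpha(\delta),\delta}\bigr) \leq \frac{\delta^p}{\alpha(\delta)} + \varphi\bigl((c+1)\delta\bigr) \leq \frac{1}{c_-}\,\varphi(\delta) + (c+1)\,\varphi(\delta).
\]

This ends the proof.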
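As a worked illustration of the parameter choice (an example added here under the assumption of a Hölder-type index function $\varphi(t) = t^\nu$ with $0 < \nu \leq 1$, which is continuous, monotonically increasing, concave, and vanishes at $t = 0$), the a priori rule (4.19) and the resulting rates of Theorem 4.23 take the following form:

```latex
\[
  c_-\,\delta^{\,p-\nu} \;\leq\; \alpha(\delta) \;\leq\; c_+\,\delta^{\,p-\nu},
  \qquad
  E\bigl(x^{\alpha(\delta),\delta}\bigr) = \mathcal{O}(\delta^{\nu}),
  \qquad
  \|K(x^{\alpha(\delta),\delta}) - y^*\|_Y = \mathcal{O}(\delta),
  \qquad \delta \to 0 .
\]
```

For the linear index function ($\nu = 1$) this gives $\alpha(\delta) \sim \delta^{p-1}$ and an error $E$ of the same order as the data error.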


We note that in the case of Hilbert spaces $X$ and $Y$ and $\Omega(x) = \|x - \hat x\|_X^2$ and $E(x) = \|x - x^*\|_X^2$ Assumption 4.22 and Theorem 4.23 reduce to Assumption 4.12 and Theorem 4.14, respectively.


Before we continue with the general theory, we apply this theorem to the special situation to determine a sparse approximation of the linear problem $Kx = y^\delta$. By sparse we mean that the solution $x^* \in X$ can be expressed by only finitely many elements of a given basis of $X$. Therefore, let $\tilde X$ be a Banach space having a Schauder basis $\{b_j : j \in \mathbb{N}\}$ with $\|b_j\|_{\tilde X} = 1$ for all $j \in \mathbb{N}$; that is, every element $x \in \tilde X$ has a unique representation as $x = \sum_{j \in \mathbb{N}} x_j b_j$ where, of course, the convergence is understood in the norm of $\tilde X$. We define the subspace $X \subset \tilde X$ by

\[
X = \Bigl\{ x = \sum_{j \in \mathbb{N}} x_j b_j : \sum_{j \in \mathbb{N}} |x_j| < \infty \Bigr\}
\]

with the norm $\|x\|_X := \sum_{j \in \mathbb{N}} |x_j|$ for $x \in X$. Then $X$ is boundedly imbedded in $\tilde X$ because $\|x\|_{\tilde X} = \bigl\|\sum_{j \in \mathbb{N}} x_j b_j\bigr\|_{\tilde X} \leq \sum_{j \in \mathbb{N}} |x_j| \, \|b_j\|_{\tilde X} = \sum_{j \in \mathbb{N}} |x_j| = \|x\|_X$ for $x \in X$. Obviously, the space $X$ is norm-isomorphic to the space $\ell^1$ of sequences $(x_j)$ such that $\sum_{j=1}^\infty |x_j|$ converges, equipped with the canonical norm $\|x\|_{\ell^1} = \sum_{j=1}^\infty |x_j|$ for $x = (x_j)$. As the Schauder basis of $\ell^1$ we take $\{e^{(j)} : j = 1, 2, \ldots\} \subset \ell^1$ where $e^{(j)} \in \ell^1$ is defined as $e^{(j)}_k = 0$ for $k \neq j$ and $e^{(j)}_j = 1$. Therefore, we can take directly $\ell^1$ as the space $X$. We note that the dual of $\ell^1$ is just $(\ell^1)^* = \ell^\infty$, the space of bounded sequences with the sup-norm and the dual pairing³ $\langle y, x \rangle_{\ell^\infty, \ell^1} = \sum_{j \in \mathbb{N}} y_j x_j$ for $y \in \ell^\infty$ and $x \in \ell^1$. Also we note that $\ell^1$ itself is the dual of the space $c_0$ of sequences converging to zero (see Example A.21). Therefore, by Theorem A.77 of Appendix A.9 the unit ball in $\ell^1$ is weak* compact which is an important ingredient to prove existence of minimizers of the Tikhonov functional.


With these introductory remarks, we are able to show the following result where 
we followed the presentation in [96]. 

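Theorem 4.24 Let $Y$ be a Banach space and $K : \ell^1 \to Y$ be a linear bounded operator such that $\langle \mu, Ke^{(j)} \rangle_{Y^*,Y} \to 0$ as $j \to \infty$ for all $\mu \in Y^*$ where $\langle \mu, y \rangle_{Y^*,Y}$ denotes the application of $\mu \in Y^*$ to $y \in Y$. Let $Kx^* = y^*$ and $y^\delta \in Y$ with $\|y^\delta - y^*\|_Y \leq \delta$.

(a) Let $x^{\alpha,\delta} \in \ell^1$ be a minimizer of

\[
J_{\alpha,\delta}(x) = \|Kx - y^\delta\|_Y^p + \alpha\,\|x\|_{\ell^1}, \qquad x \in \ell^1 .
\]

Then $x^{\alpha,\delta} \in \ell^1$ is sparse; that is, the number of non-vanishing components $x^{\alpha,\delta}_k \neq 0$ is finite.

(b) For every $j \in \mathbb{N}$ let there exist $f_j \in Y^*$ with $e^{(j)} = K^* f_j$ where $K^* : Y^* \to (\ell^1)^* = \ell^\infty$ is the dual operator corresponding to $K$. Define the function $\varphi : [0,\infty) \to \mathbb{R}$ as

\[
\varphi(t) := 2 \inf_{n \in \mathbb{N}} \Bigl[ \gamma_n t + \sum_{j > n} |x_j^*| \Bigr] \quad \text{where} \quad \gamma_n = \sup_{s_j \in \{0,1,-1\}} \Bigl\| \sum_{j=1}^n s_j f_j \Bigr\|_{Y^*} .
\]

Then $\varphi$ is a concave index function. With the choices (4.19) or (4.20) of $\alpha = \alpha(\delta)$ the following convergence rates hold:

\[
\|x^{\alpha(\delta),\delta} - x^*\|_{\ell^1} = \mathcal{O}\bigl(\varphi(\delta)\bigr), \qquad \|Kx^{\alpha(\delta),\delta} - y^*\|_Y = \mathcal{O}(\delta)
\]

as $\delta$ tends to zero.

(c) If $x^* \in \ell^1$ is such that $\sum_{j=1}^\infty \gamma_j^\sigma |x_j^*| < \infty$ for some $\sigma > 0$ then we have

\[
\|x^{\alpha(\delta),\delta} - x^*\|_{\ell^1} = \mathcal{O}\bigl(\delta^{\sigma/(1+\sigma)}\bigr), \qquad \delta \to 0 .
\]

If $x^*$ is sparse or if $(\gamma_n)$ is bounded then we have $\|x^{\alpha(\delta),\delta} - x^*\|_{\ell^1} = \mathcal{O}(\delta)$ as $\delta$ tends to zero.

³ Note that we denote the dual pairing by $\langle \ell, x \rangle_{X^*,X} = \ell(x)$ for $\ell \in X^*$ and $x \in X$. The mapping $(\ell, x) \mapsto \langle \ell, x \rangle_{X^*,X}$ is bilinear.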


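Proof: (a) Set $z = Kx^{\alpha,\delta} - y^\delta$ for abbreviation. The optimality of $x^{\alpha,\delta}$ reads as

\[
\|z + Kh\|_Y^p - \|z\|_Y^p \geq -\alpha \bigl[ \|x^{\alpha,\delta} + h\|_{\ell^1} - \|x^{\alpha,\delta}\|_{\ell^1} \bigr] \quad \text{for all } h \in \ell^1 . \tag{4.23}
\]

Define the sets $A, B \subset \mathbb{R} \times Y$ as follows:

\[
\begin{aligned}
A &= \bigl\{ (r, y) \in \mathbb{R} \times Y : r > \|z + y\|_Y^p - \|z\|_Y^p \bigr\}, \\
B &= \bigl\{ (r, Kh) \in \mathbb{R} \times Y : h \in \ell^1,\ r \leq -\alpha \bigl[ \|x^{\alpha,\delta} + h\|_{\ell^1} - \|x^{\alpha,\delta}\|_{\ell^1} \bigr] \bigr\} .
\end{aligned}
\]

Then it is not difficult to show that $A$ and $B$ are convex, $A$ is open, and $A \cap B = \emptyset$ because of (4.23). Now we apply the separation theorem for convex sets (see Theorem A.69 of Appendix A.8). There exists $(s, \mu) \in \mathbb{R} \times Y^*$ and $\gamma \in \mathbb{R}$ such that $(s, \mu) \neq (0, 0)$ and

\[
s\,r + \langle \mu, y \rangle_{Y^*,Y} \geq \gamma \geq s\,r' + \langle \mu, Kh \rangle_{Y^*,Y} \quad \text{for all } (r, y) \in A \text{ and } (r', Kh) \in B .
\]

Letting $r$ tend to infinity while keeping the other variables constant yields $s \geq 0$. It is $s \neq 0$ because otherwise we would have $\langle \mu, y \rangle_{Y^*,Y} \geq \gamma$ for all $y \in Y$ (set $r := \|z + y\|_Y^p - \|z\|_Y^p + 1$) which would yield that also $\mu$ vanishes⁴, a contradiction. Therefore, $s > 0$, and without loss of generality, $s = 1$. Now we set $y = 0$ and fix any $h \in \ell^1$ and let $r$ tend to zero from above and set $r' = -\alpha \bigl[ \|x^{\alpha,\delta} + h\|_{\ell^1} - \|x^{\alpha,\delta}\|_{\ell^1} \bigr]$. This yields the inequality

\[
0 \geq -\alpha \bigl[ \|x^{\alpha,\delta} + h\|_{\ell^1} - \|x^{\alpha,\delta}\|_{\ell^1} \bigr] + \langle \mu, Kh \rangle_{Y^*,Y}
\]

for all $h \in \ell^1$. For any $t \in \mathbb{R}$ and $k \in \mathbb{N}$, we set $h = t\,e^{(k)}$ and arrive at

\[
\alpha \bigl[ |x_k^{\alpha,\delta} + t| - |x_k^{\alpha,\delta}| \bigr] - t \, \langle \mu, Ke^{(k)} \rangle_{Y^*,Y} \geq 0 .
\]

For fixed $k$ with $x_k^{\alpha,\delta} \neq 0$ we choose $|t|$ so small such that $\operatorname{sign}(x_k^{\alpha,\delta} + t) = \operatorname{sign}(x_k^{\alpha,\delta})$. Then the previous inequality reads as

\[
t \bigl[ \alpha \operatorname{sign}(x_k^{\alpha,\delta}) - \langle \mu, Ke^{(k)} \rangle_{Y^*,Y} \bigr] \geq 0 .
\]

Choosing $t > 0$ and $t < 0$ yields $\alpha \operatorname{sign}(x_k^{\alpha,\delta}) = \langle \mu, Ke^{(k)} \rangle_{Y^*,Y}$; that is, $|\langle \mu, Ke^{(k)} \rangle_{Y^*,Y}| = \alpha$. This holds for every $k \in J := \{k \in \mathbb{N} : x_k^{\alpha,\delta} \neq 0\}$. This implies that $J$ is finite because of $\langle \mu, Ke^{(k)} \rangle_{Y^*,Y} \to 0$ as $k \to \infty$.

⁴ The reader should prove this himself.

(b) We apply Theorem 4.23 and have to verify Assumptions 4.21 and 4.22 for the special case $D(K) = D(\Omega) = X = \ell^1$, $E(x) = \|x - x^*\|_{\ell^1}$, and $\Omega(x) = \|x\|_{\ell^1}$ for $x \in X = \ell^1$. First we show that $\varphi$ is a concave index function. Indeed, $\varphi$ is continuous, monotonic, and concave as the infimum of affine functions (see Problem 4.6). Furthermore, $\varphi(0) = 2 \inf_{n \in \mathbb{N}} \sum_{j > n} |x_j^*| = 0$ which shows that $\varphi$ is a concave index function. Assumption 4.21; that is, existence of a minimum of $J_{\alpha,\delta}$ can be shown using results on weak- and weak*-topologies. (The assumption that $\langle \mu, Ke^{(j)} \rangle_{Y^*,Y}$ tends to zero for all $\mu \in Y^*$ is equivalent to the weak*-weak continuity of $K$. Then one uses that $\|\cdot\|_Y^p$ and $\|\cdot\|_{\ell^1}$ are lower weak semi-continuous and lower weak* semi-continuous, respectively.) We do not carry out this part but refer to, e.g., [96]. Assumption 4.22 is therefore equivalent to (with $\beta = 1$)

\[
\|x - x^*\|_{\ell^1} - \|x\|_{\ell^1} + \|x^*\|_{\ell^1} \leq \varphi\bigl(\|K(x - x^*)\|_Y\bigr) \quad \text{for all } x \in \ell^1 . \tag{4.24}
\]

To prove this we have for any $n \in \mathbb{N}$

\[
\begin{aligned}
\sum_{j=1}^n |x_j - x_j^*| &= \sum_{j=1}^n s_j \, \langle e^{(j)}, x - x^* \rangle_{\ell^\infty,\ell^1} = \sum_{j=1}^n s_j \, \langle f_j, K(x - x^*) \rangle_{Y^*,Y} \\
 &\leq \Bigl\| \sum_{j=1}^n s_j f_j \Bigr\|_{Y^*} \|K(x - x^*)\|_Y \ \leq\ \gamma_n \, \|K(x - x^*)\|_Y
\end{aligned}
\]

where $s_j = \operatorname{sign}(x_j - x_j^*) \in \{0, 1, -1\}$. Therefore, with $|x_j| \geq |x_j^*| - |x_j - x_j^*|$,

\[
\sum_{j=1}^n \bigl[ |x_j - x_j^*| - |x_j| + |x_j^*| \bigr] \leq 2 \sum_{j=1}^n |x_j - x_j^*| \leq 2 \gamma_n \, \|K(x - x^*)\|_Y .
\]

Furthermore,

\[
\sum_{j > n} \bigl[ |x_j - x_j^*| - |x_j| + |x_j^*| \bigr] \leq 2 \sum_{j > n} |x_j^*| ;
\]

that is,

\[
\|x - x^*\|_{\ell^1} - \|x\|_{\ell^1} + \|x^*\|_{\ell^1} \leq 2 \Bigl[ \gamma_n \, \|K(x - x^*)\|_Y + \sum_{j > n} |x_j^*| \Bigr] . \tag{4.25}
\]

This shows (4.24) since this estimate holds for all $n \in \mathbb{N}$. Therefore, all of the assumptions of Theorem 4.23 are satisfied, and the error estimates are shown.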
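(c) We estimate $\varphi(t)$. First we observe that $(\gamma_n)$ is monotonically increasing. If $\gamma_n$ is bounded then it converges to some finite $\gamma \in \mathbb{R}$. Therefore, we let $n$ tend to infinity in the definition of $\varphi$ and arrive at $\varphi(t) \leq 2\gamma t$. We consider now the case that $\gamma_n$ tends to infinity. First we estimate

\[
\gamma_n t + \sum_{j > n} |x_j^*| \leq \gamma_n t + \frac{1}{\gamma_{n+1}^\sigma} \sum_{j > n} \gamma_j^\sigma |x_j^*| \leq \gamma_n t + \frac{c}{\gamma_{n+1}^\sigma}
\]

with $c = \sum_{j=1}^\infty \gamma_j^\sigma |x_j^*|$. For sufficiently small $t$, the index

\[
n(t) = \max\bigl\{ n \in \mathbb{N} : \gamma_n \leq t^{-1/(1+\sigma)} \bigr\}
\]

is well-defined and finite. Then $\gamma_{n(t)} \leq t^{-1/(1+\sigma)}$ and $\gamma_{n(t)+1} > t^{-1/(1+\sigma)}$. Therefore,

\[
\varphi(t) \leq 2 \Bigl[ \gamma_{n(t)} \, t + \frac{c}{\gamma_{n(t)+1}^\sigma} \Bigr] \leq 2\,(1 + c)\, t^{\sigma/(1+\sigma)} .
\]

Finally, if $x^*$ is sparse then there exists $n \in \mathbb{N}$ such that $x_j^* = 0$ for all $j > n$. For that $n$ the series $\sum_{j > n} |x_j^*|$ in (4.25) vanishes which shows the result for the linear index function $\varphi(t) = 2\gamma_n t$.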
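The sparsity statement of part (a) can be made concrete in a simple special case. The sketch below is only an illustration under assumptions that are not taken from the text: $p = 2$, a finite truncation to $N$ components, and a diagonal operator $K e^{(j)} = \kappa_j e^{(j)}$ with $\kappa_j = 1/j \to 0$. In this setting the functional decouples into scalar problems whose minimizers are given by soft thresholding (a standard technique, named here as such); the function names and the data are hypothetical.

```python
import math

def tikhonov_l1(x, kappa, y, alpha):
    """Value of sum_j (kappa_j x_j - y_j)^2 + alpha * sum_j |x_j|."""
    misfit = sum((k * xj - yj) ** 2 for k, xj, yj in zip(kappa, x, y))
    return misfit + alpha * sum(abs(xj) for xj in x)

def soft_threshold_minimizer(kappa, y, alpha):
    """Componentwise minimizer of the decoupled functional above."""
    x = []
    for k, yj in zip(kappa, y):
        g = k * yj                              # data component seen through K
        mag = max(abs(g) - alpha / 2.0, 0.0)    # shrink towards zero by alpha/2
        x.append(math.copysign(mag, g) / (k * k) if mag > 0.0 else 0.0)
    return x

N = 50
kappa = [1.0 / (j + 1) for j in range(N)]       # kappa_j -> 0 (assumed decay)
x_true = [0.0] * N
x_true[0], x_true[2] = 1.0, -0.5                # sparse exact solution

delta = 0.01
noise = [delta * (-1.0) ** (j + 1) / math.sqrt(N) for j in range(N)]  # Euclidean norm = delta
y_delta = [k * xj + e for k, xj, e in zip(kappa, x_true, noise)]

alpha = 0.05
x_rec = soft_threshold_minimizer(kappa, y_delta, alpha)
support = [j for j, xj in enumerate(x_rec) if xj != 0.0]
print(support, [x_rec[j] for j in support])
```

Only components with $|\kappa_j y_j^\delta| > \alpha/2$ survive, and since $\kappa_j \to 0$ this can happen for finitely many indices only, in line with the role of the condition $\langle \mu, Ke^{(j)} \rangle_{Y^*,Y} \to 0$ in the proof.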


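Remark: The reciprocals $1/\gamma_n$ play the role of the singular values in the case of Hilbert spaces $X = \ell^2$ and $Y$ with a singular system $\{\mu_j, e^{(j)}, g^{(j)} : j \in \mathbb{N}\}$. Indeed, the assumption $e^{(j)} = K^* f_j$ is satisfied with $f_j = \frac{1}{\mu_j}\, g^{(j)}$ and for $\gamma_n$ one has the form

\[
\gamma_n^2 = \sup_{s_j \in \{0,1,-1\}} \Bigl\| \sum_{j=1}^n s_j f_j \Bigr\|_Y^2 = \sup_{s_j \in \{0,1,-1\}} \sum_{j=1}^n \frac{s_j^2}{\mu_j^2} = \sum_{j=1}^n \frac{1}{\mu_j^2} .
\]

The condition that $\sum_{j=1}^\infty \gamma_j^\sigma |x_j^*|$ converges corresponds to the source condition $x^* \in \mathcal{R}\bigl((A^*A)^{\sigma/2}\bigr)$.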


We go now back to the general case studied in Theorem 4.23. We wish to carry over Theorem 4.13 which proves the variational source condition from the classical one of Assumption 4.11. The essential formula used in the proof of part (a) of Theorem 4.13 was based on the binomial formula; that is,

\[
\|x - x^*\|_X^2 = \|x - \hat x\|_X^2 - \|x^* - \hat x\|_X^2 - 2\,(x - x^*, x^* - \hat x)_X .
\]

If we denote the penalty term by $\Omega(x)$; that is, $\Omega(x) = \|x - \hat x\|_X^2$ then we can write this formula as

\[
\|x - x^*\|_X^2 = B^\Omega(x, x^*) := \Omega(x) - \Omega(x^*) - \langle \Omega'(x^*), x - x^* \rangle_{X^*,X}
\]

where $\Omega'(x^*) : X \to \mathbb{R}$ is the Fréchet derivative of $\Omega$ at $x^*$ and $\langle \ell, z \rangle_{X^*,X}$ is the application of $\ell \in X^*$ to $z \in X$. The function $B^\Omega(x, x^*)$ is the famous Bregman distance corresponding to the function $\Omega$. This can be extended to Banach spaces because for convex and differentiable functions $\Omega$ from a Banach space $X$ into $\mathbb{R}$ the function

\[
B^\Omega(x, x^*) := \Omega(x) - \Omega(x^*) - \langle \Omega'(x^*), x - x^* \rangle_{X^*,X}, \qquad x \in X,
\]

is nonnegative on $X$ (see Lemma A.70 of Appendix A.8). If $\Omega$ is even strictly convex, then $B^\Omega(x, x^*) = 0 \Leftrightarrow x = x^*$. If $\Omega : X \to \mathbb{R}$ is convex and only continuous then the subdifferential $\partial\Omega(x^*) \subset X^*$; that is, the set of subgradients, is non-empty (see Lemma A.72 of Appendix A.8). We recall that $\partial\Omega(x^*) \subset X^*$ is the set of all $\ell \in X^*$ with

\[
\Omega(x) - \Omega(x^*) - \langle \ell, x - x^* \rangle_{X^*,X} \geq 0 \quad \text{for all } x \in X .
\]

If $\Omega$ is convex and differentiable then $\partial\Omega(x^*) = \{\Omega'(x^*)\}$ (Lemma A.72). Therefore, we formulate the following definition.

Definition 4.25 Let $X$ be a normed space, $A \subset X$ convex and open, and $\Omega : A \to \mathbb{R}$ convex and continuous with subdifferential $\partial\Omega(x^*)$ at some $x^* \in A$. For $\ell \in \partial\Omega(x^*)$ the Bregman distance is defined as

\[
B^\Omega_\ell(x, x^*) := \Omega(x) - \Omega(x^*) - \langle \ell, x - x^* \rangle_{X^*,X}, \qquad x \in A .
\]

We note that the Bregman distance depends on the function $\Omega$. It measures the defect of the function with its linearization at $x^*$. With the Bregman distance as $E(x)$ we have an analogue of Theorem 4.13.
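To get a feeling for Definition 4.25, the Bregman distance can be evaluated in a finite-dimensional toy setting. The sketch below assumes $X = \mathbb{R}^n$ with $\Omega(x) = \sum_i |x_i - \hat x_i|^p$ for $p > 1$ (so that $\Omega$ is differentiable and the subgradient is the gradient); the function names and the sample vectors are hypothetical. For $p = 2$ the Bregman distance should reproduce $\|x - x^*\|^2$, as in the binomial formula above.

```python
import math

def omega(x, x_hat, p):
    """Penalty Omega(x) = sum_i |x_i - x_hat_i|^p."""
    return sum(abs(a - b) ** p for a, b in zip(x, x_hat))

def grad_omega(x, x_hat, p):
    """Gradient of Omega for p > 1 (it vanishes where x_i = x_hat_i)."""
    g = []
    for a, b in zip(x, x_hat):
        d = a - b
        g.append(p * abs(d) ** (p - 1) * math.copysign(1.0, d) if d != 0.0 else 0.0)
    return g

def bregman(x, x_star, x_hat, p):
    """Bregman distance Omega(x) - Omega(x*) - <Omega'(x*), x - x*>."""
    g = grad_omega(x_star, x_hat, p)
    lin = sum(gi * (a - b) for gi, a, b in zip(g, x, x_star))
    return omega(x, x_hat, p) - omega(x_star, x_hat, p) - lin

x = [1.0, -2.0, 0.5]
x_star = [0.5, 0.0, 0.5]
x_hat = [0.0, 1.0, -1.0]

b2 = bregman(x, x_star, x_hat, 2)                        # p = 2: Hilbert space case
sq_dist = sum((a - b) ** 2 for a, b in zip(x, x_star))   # ||x - x*||^2
b3 = bregman(x, x_star, x_hat, 3)                        # p = 3: nonnegative by convexity
print(b2, sq_dist, b3)
```

For $p \neq 2$ the Bregman distance is no longer symmetric in its two arguments, which is one reason why it is treated as a measure function $E$ rather than as a metric.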


Lemma 4.26 Let $x^* \in D(K) \cap D(\Omega)$ be a solution of $K(x) = y^*$ and let $\varphi : [0,\infty) \to \mathbb{R}$ be a concave index function. Furthermore, let $\Omega : X \to \mathbb{R}$ be convex and continuous and $\ell \in \partial\Omega(x^*)$ and $E(x) = B^\Omega_\ell(x, x^*)$.

(a) Then, for this particular choice of $E(x)$, the variational source condition of Assumption 4.22 is equivalent to the following condition: There exists $\rho > 0$ and $0 \leq \sigma < 1$ such that

\[
\langle \ell, x^* - x \rangle_{X^*,X} \leq \sigma\,B^\Omega_\ell(x, x^*) + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr) \tag{4.26}
\]

for all $x \in M_\rho$.

(b) Assume that there exist $w \in Y^*$ and a concave index function $\eta$ such that $\ell = K'(x^*)^* w$ and

\[
\|K'(x^*)(x - x^*)\|_Y \leq \eta\bigl(\|K(x^*) - K(x)\|_Y\bigr) \quad \text{for all } x \in M_\rho .
\]

Then Assumption 4.22 holds with $\varphi(t) = \|w\|_{Y^*}\,\eta(t)$ for $t \geq 0$.


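Proof: (a) This follows directly from the definitions of $E(x)$ and $B^\Omega_\ell(x, x^*)$. Indeed, the estimate in Assumption 4.22 reads as

\[
\beta\,B^\Omega_\ell(x, x^*) \leq \Omega(x) - \Omega(x^*) + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr);
\]

that is,

\[
\beta\,B^\Omega_\ell(x, x^*) \leq B^\Omega_\ell(x, x^*) + \langle \ell, x - x^* \rangle_{X^*,X} + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr)
\]

which is equivalent to

\[
\langle \ell, x^* - x \rangle_{X^*,X} \leq (1 - \beta)\,B^\Omega_\ell(x, x^*) + \varphi\bigl(\|K(x^*) - K(x)\|_Y\bigr).
\]

This proves part (a) with $\sigma = 1 - \beta$ if $\beta \leq 1$ and $\sigma = 0$ otherwise.

(b) If $\ell = K'(x^*)^* w$ then

\[
\begin{aligned}
\langle \ell, x^* - x \rangle_{X^*,X} &= \langle K'(x^*)^* w, x^* - x \rangle_{X^*,X} = \langle w, K'(x^*)(x^* - x) \rangle_{Y^*,Y} \\
 &\leq \|w\|_{Y^*} \, \|K'(x^*)(x^* - x)\|_Y \ \leq\ \|w\|_{Y^*} \, \eta\bigl(\|K(x^*) - K(x)\|_Y\bigr).
\end{aligned}
\]

This proves the condition of (a) with $\sigma = 0$; that is, $\beta = 1$.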


Combining Theorem 4.23 with these particular choices, we have the following theorem:

Theorem 4.27 Let Assumption 4.21 hold and let $x^* \in D(K) \cap D(\Omega)$ be a solution of $K(x) = y^*$. Furthermore, let $\delta_{\max} = \sup\{\|K(x^*) - K(x)\|_Y : x \in D(K) \cap D(\Omega)\}$ and $\varphi : [0, \delta_{\max}) \to \mathbb{R}$ be a concave index function, and for some constants $\rho > 0$ and $0 \leq \sigma < 1$, let the source condition (4.26) hold for some $\ell \in \partial\Omega(x^*)$. Let $\alpha = \alpha(\delta)$ be chosen according to (4.19) or (4.20). Then we have the error estimates

\[
B^\Omega_\ell\bigl(x^{\alpha(\delta),\delta}, x^*\bigr) = \mathcal{O}\bigl(\varphi(\delta)\bigr) \quad \text{and} \quad \|K(x^{\alpha(\delta),\delta}) - y^*\|_Y = \mathcal{O}(\delta)
\]

as $\delta$ tends to zero.


As a particular and obviously important example, we now take

\[
\Omega(x) := \|x - \hat x\|_X^p, \qquad x \in X,
\]

for some $p > 1$. Then one would like to characterize the Bregman distance or, at least, construct lower bounds of $B^\Omega_\ell(x, x^*)$ in terms of $\|x - x^*\|_X$. This leads to the concept of $p$-convex Banach spaces.


Definition 4.28 A Banach space $X$ is called $p$-convex for some $p > 0$ if there exists $c > 0$ such that

\[
\|x + y\|_X^p - \|x\|_X^p - \langle \ell_x, y \rangle_{X^*,X} \geq c\,\|y\|_X^p \quad \text{for all } \ell_x \in \partial\bigl(\|\cdot\|_X^p\bigr)(x)\,{}^5
\]

and $x, y \in X$.

In other words, for $p$-convex spaces the Bregman distance $B^\Omega_\ell(x, z)$ corresponding to $\Omega(x) = \|x - \hat x\|_X^p$ can be bounded below by $c\,\|x - z\|_X^p$ for all $x, z \in X$. Therefore, if the assumptions of the previous Theorem 4.27 hold one has the rate $\|x^{\alpha(\delta),\delta} - x^*\|_X^p = \mathcal{O}(\varphi(\delta))$.

As a particular example, we show that the spaces $L^p(D)$ are $p$-convex for all $p \geq 2$.


Lemma 4.29 Let $p > 1$ and $D \subset \mathbb{R}^n$ open and

\[
f(x) := \|x\|_{L^p(D)}^p = \int_D |x(t)|^p \, dt \quad \text{for } x \in L^p(D).
\]

(a) Then $f$ is differentiable and

\[
f'(x)y = p \int_D y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \, dt \quad \text{for } x, y \in L^p(D).
\]

(b) Let $p \geq 2$. Then there exists $c_p > 0$ with

\[
f(x + y) - f(x) - f'(x)y \geq c_p \, \|y\|_{L^p(D)}^p \quad \text{for all } x, y \in L^p(D).
\]

⁵ Actually, the classical definition uses the dual mapping instead of the subdifferential. However, by Asplund's theorem (see [48]) they are equivalent.

Proof: (a) First we observe that the integral for $f'(x)y$ exists by Hölder's inequality. Indeed, set $r = \frac{p}{p-1}$ and $s = p$ then $\frac{1}{r} + \frac{1}{s} = 1$ and thus

\[
\begin{aligned}
\int_D |x(t)|^{p-1} |y(t)| \, dt &\leq \Bigl( \int_D |x(t)|^{r(p-1)} \, dt \Bigr)^{1/r} \Bigl( \int_D |y(t)|^s \, dt \Bigr)^{1/s} \\
 &= \Bigl( \int_D |x(t)|^p \, dt \Bigr)^{(p-1)/p} \Bigl( \int_D |y(t)|^p \, dt \Bigr)^{1/p} \\
 &= \|x\|_{L^p(D)}^{p-1} \, \|y\|_{L^p(D)} .
\end{aligned}
\]

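We use the following elementary estimate. There exist constants $c_+ > 0$ and $c_p \geq 0$ with $c_p = 0$ for $p < 2$ and $c_p > 0$ for $p \geq 2$ such that for all $z \in \mathbb{R}$

\[
c_p \, |z|^p \leq |1 + z|^p - 1 - pz \leq
\begin{cases}
c_+ \, |z|^p & \text{if } p \leq 2 \text{ or } |z| \geq \tfrac{1}{2}, \\
c_+ \, |z|^2 & \text{if } p > 2 \text{ and } |z| \leq \tfrac{1}{2}.
\end{cases} \tag{4.27}
\]

We give a proof of this estimate in Lemma A.74 for the convenience of the reader.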


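Therefore, for any $x, y \in \mathbb{R}$ with $x \neq 0$, we have

\[
|x + y|^p - |x|^p - p\,y\,|x|^{p-1} \operatorname{sign} x = |x|^p \Bigl[ \Bigl| 1 + \frac{y}{x} \Bigr|^p - 1 - p\,\frac{y}{x} \Bigr] \geq 0
\]

and

\[
|x + y|^p - |x|^p - p\,y\,|x|^{p-1} \operatorname{sign} x = |x|^p \Bigl[ \Bigl| 1 + \frac{y}{x} \Bigr|^p - 1 - p\,\frac{y}{x} \Bigr] \leq
\begin{cases}
c_+ \, |x|^p \, |y/x|^p = c_+ \, |y|^p & \text{if } p \leq 2 \text{ or } 2|y| \geq |x|, \\
c_+ \, |x|^p \, |y/x|^2 = c_+ \, |x|^{p-2} |y|^2 & \text{if } p > 2 \text{ and } 2|y| \leq |x|.
\end{cases}
\]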

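This holds obviously also for $x = 0$. Now we apply this to $x(t)$ and $y(t)$ with $x, y \in L^p(D)$. This shows already that

\[
\int_D \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \geq 0 .
\]

Next we show that

\[
\int_D \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \leq c \, \|y\|_{L^p(D)}^{\min\{2,p\}}
\]

for $\|y\|_{L^p(D)} \leq 1$. This would finish the proof of part (a) because $p > 1$. For proving this estimate we define $T := \{t \in D : 2|y(t)| \geq |x(t)|\}$. Then

\[
\int_T \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \leq c_+ \int_T |y(t)|^p \, dt
\]

and

\[
\int_{D \setminus T} \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \leq
\begin{cases}
c_+ \displaystyle\int_{D \setminus T} |y(t)|^p \, dt & \text{if } p \leq 2, \\[2ex]
c_+ \displaystyle\int_{D \setminus T} |x(t)|^{p-2} |y(t)|^2 \, dt & \text{if } p > 2 .
\end{cases}
\]

If $p \leq 2$ we just add the two estimates and have shown the estimate

\[
\int_D \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \leq c_+ \, \|y\|_{L^p(D)}^p .
\]

If $p > 2$ we apply Hölder's inequality to the integral $\int_{D \setminus T} |x(t)|^{p-2} |y(t)|^2 \, dt$. Indeed, set $r = \frac{p}{p-2}$ and $s = \frac{p}{2}$ then $\frac{1}{r} + \frac{1}{s} = 1$ and thus

\[
\begin{aligned}
\int_{D \setminus T} |x(t)|^{p-2} |y(t)|^2 \, dt &\leq \Bigl( \int_{D \setminus T} |x(t)|^{r(p-2)} \, dt \Bigr)^{1/r} \Bigl( \int_{D \setminus T} |y(t)|^{2s} \, dt \Bigr)^{1/s} \\
 &= \Bigl( \int_{D \setminus T} |x(t)|^p \, dt \Bigr)^{(p-2)/p} \Bigl( \int_{D \setminus T} |y(t)|^p \, dt \Bigr)^{2/p} \\
 &\leq \|x\|_{L^p(D)}^{p-2} \, \|y\|_{L^p(D)}^2 .
\end{aligned}
\]

Therefore,

\[
\int_D \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \leq c_+ \, \|y\|_{L^p(D)}^p + c_+ \, \|x\|_{L^p(D)}^{p-2} \, \|y\|_{L^p(D)}^2 .
\]

For $p \geq 2$ and $\|y\|_{L^p(D)} \leq 1$ we have $\|y\|_{L^p(D)}^p \leq \|y\|_{L^p(D)}^2$ and thus

\[
\int_D \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) \bigr] \, dt \leq c \, \|y\|_{L^p(D)}^2 .
\]

This proves part (a).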
(b) By part (a) we have to show that

\[
\int_D \bigl[ |x(t) + y(t)|^p - |x(t)|^p - p\,y(t)\,|x(t)|^{p-1} \operatorname{sign} x(t) - c_p \, |y(t)|^p \bigr] \, dt \geq 0
\]

for all $x, y \in L^p(D)$. We show that the integrand is non-negative. Indeed, for $x \neq 0$ and $z = y/x$ we have

\[
|x + y|^p - |x|^p - p\,y\,|x|^{p-1} \operatorname{sign} x - c_p \, |y|^p = |x|^p \bigl[ |1 + z|^p - 1 - pz - c_p \, |z|^p \bigr]
\]

and this is nonnegative by (4.27).
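As a quick sanity check of (4.27) and of part (b), consider the Hilbert space case $p = 2$ (a worked example added here for illustration, with real-valued functions). The elementary estimate then holds with equality, so one may take $c_2 = c_+ = 1$, and the inequality of part (b) becomes an identity:

```latex
\[
  |1+z|^2 - 1 - 2z \;=\; z^2 ,
  \qquad\text{hence}\qquad
  f(x+y) - f(x) - f'(x)y
  \;=\; \int_D \bigl[\,|x(t)+y(t)|^2 - |x(t)|^2 - 2\,x(t)\,y(t)\bigr]\,dt
  \;=\; \|y\|_{L^2(D)}^2 .
\]
```

This is again the binomial formula that motivated the Bregman distance.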


By the same method, it can be shown that also the spaces $\ell^p$ of sequences and the Sobolev spaces $W^{m,p}(D)$ are $p$-convex for $p \geq 2$ and any $m \in \mathbb{N}_0$.


4.3 The Nonlinear Landweber Iteration 


As a general criticism towards Tikhonov's method in the nonlinear case, we mention the disadvantage that the convergence results hold only for the global minima of the Tikhonov functional $J_{\alpha,\delta}$ which is, in general, a non-convex function. In the best of all cases, the global minima can only be computed by iterative methods⁶. Therefore, it seems to be natural to solve the nonlinear equation $K(x) = y^\delta$ directly by an iterative scheme such as Newton-type methods or Landweber methods. In this section, we present the simplest of such algorithms and follow the presentation of the paper [126] and also the monograph [149].


In this section, let again $X$ and $Y$ be Hilbert spaces and $K : X \supset D(K) \to Y$ a continuously Fréchet differentiable mapping from the open domain of definition $D(K)$ and let $\hat x \in D(K)$.

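First we recall the nonlinear Tikhonov functional

\[
J_{\alpha,\delta}(x) = \|K(x) - y^\delta\|_Y^2 + \alpha\,\|x - \hat x\|_X^2, \qquad x \in D(K),
\]

for $\alpha, \delta > 0$. This functional is differentiable.

Lemma 4.30 $J_{\alpha,\delta}$ is differentiable in every $x^* \in D(K)$ and

\[
J'_{\alpha,\delta}(x^*)h = 2 \operatorname{Re}\bigl(K(x^*) - y^\delta, K'(x^*)h\bigr)_Y + 2\alpha \operatorname{Re}(x^* - \hat x, h)_X, \qquad h \in X .
\]

Proof: We have $K(x^* + h) = K(x^*) + K'(x^*)h + r(h)$ with $\|r(h)\|_Y / \|h\|_X \to 0$ for $h \to 0$. Using the binomial formula $\|a + b\|^2 = \|a\|^2 + 2 \operatorname{Re}(a, b) + \|b\|^2$ we have

\[
\begin{aligned}
J_{\alpha,\delta}(x^* + h) &= \|K(x^* + h) - y^\delta\|_Y^2 + \alpha\,\|x^* + h - \hat x\|_X^2 \\
 &= \bigl\|(K(x^*) - y^\delta) + (K'(x^*)h + r(h))\bigr\|_Y^2 + \alpha\,\|(x^* - \hat x) + h\|_X^2 \\
 &= \|K(x^*) - y^\delta\|_Y^2 + 2 \operatorname{Re}\bigl(K(x^*) - y^\delta, K'(x^*)h + r(h)\bigr)_Y + \|K'(x^*)h + r(h)\|_Y^2 \\
 &\quad + \alpha\,\|x^* - \hat x\|_X^2 + 2\alpha \operatorname{Re}(x^* - \hat x, h)_X + \alpha\,\|h\|_X^2 \\
 &= J_{\alpha,\delta}(x^*) + 2 \operatorname{Re}\bigl(K(x^*) - y^\delta, K'(x^*)h\bigr)_Y + 2\alpha \operatorname{Re}(x^* - \hat x, h)_X + \tilde r(h)
\end{aligned}
\]

with

\[
\tilde r(h) = 2 \operatorname{Re}\bigl(K(x^*) - y^\delta, r(h)\bigr)_Y + \|K'(x^*)h + r(h)\|_Y^2 + \alpha\,\|h\|_X^2 ,
\]

thus $\lim_{h \to 0} \tilde r(h) / \|h\|_X = 0$.

⁶ Even this is only possible in very special cases. Usually, one must be satisfied with critical points of $J_{\alpha,\delta}$.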


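Lemma 4.31 If $x^* \in D(K)$ is a local minimum of $J_{\alpha,\delta}$ then $J'_{\alpha,\delta}(x^*)h = 0$ for all $h \in X$, thus $K'(x^*)^*\bigl(K(x^*) - y^\delta\bigr) + \alpha\,(x^* - \hat x) = 0$.

Proof: Let $h \neq 0$ be fixed. For sufficiently small $t > 0$ the point $x^* + th$ belongs to $D(K)$ because $D(K)$ is open. Therefore, for sufficiently small $t > 0$ the optimality of $x^*$ implies $\bigl[J_{\alpha,\delta}(x^* + th) - J_{\alpha,\delta}(x^*)\bigr] / \|th\|_X \geq 0$, thus

\[
0 \leq \frac{J_{\alpha,\delta}(x^* + th) - J_{\alpha,\delta}(x^*) - J'_{\alpha,\delta}(x^*)(th)}{\|th\|_X} + \frac{J'_{\alpha,\delta}(x^*)h}{\|h\|_X} .
\]

The inequality $J'_{\alpha,\delta}(x^*)h \geq 0$ follows as $t$ tends to zero. This implies the assertion since this holds for all $h$.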


The equation

\[
K'(x^*)^*\bigl(K(x^*) - y^\delta\bigr) + \alpha\,(x^* - \hat x) = 0 \tag{4.28}
\]

reduces to the well known normal equation (2.16) in the case that $K$ is linear and $\hat x = 0$ because the derivative is given by $K'(x^*) = K$. However, in general, this equation is nonlinear, and one has to apply an iterative method for solving this equation. Instead of solving the well posed nonlinear equation (4.28), one can directly use an iterative scheme to solve (and regularize) the unregularized equation $K'(x^*)^*\bigl(K(x^*) - y^\delta\bigr) = 0$, which can be written as

\[
x^* = x^* - a\,K'(x^*)^*\bigl(K(x^*) - y^\delta\bigr)
\]

with an arbitrary number $a > 0$. This is a fixed point equation, and it is natural to solve it iteratively by the fixed point iteration: Fix $\hat x \in D(K)$. Set $x_0^\delta = \hat x$ and

\[
x_{k+1}^\delta = x_k^\delta - a\,K'(x_k^\delta)^*\bigl(K(x_k^\delta) - y^\delta\bigr), \qquad k = 0, 1, 2, \ldots
\]

This is called the nonlinear Landweber iteration because it reduces to the well known Landweber iteration in the linear case, see Sections 2.3 and 2.6. At the moment, it is not clear that it is well-defined; that is, that all of the iterates lie in the domain of definition $D(K)$.
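We choose $\rho > 0$ and $a > 0$ such that $a\,\|K'(x)\|^2_{\mathcal{L}(X,Y)} \leq 1$ for all $x \in B(\hat x, \rho)$. Then we scale the equation $K(x) = y^\delta$; that is, we replace it by the equivalent equation $\tilde K(x) = \tilde y^\delta$ with $\tilde K = \sqrt{a}\,K$ and $\tilde y^\delta = \sqrt{a}\,y^\delta$. Then $\|\tilde K'(x)\|^2_{\mathcal{L}(X,Y)} \leq 1$ for all $x \in B(\hat x, \rho)$, and the Landweber iteration takes the form

\[
x_{k+1}^\delta = x_k^\delta - \tilde K'(x_k^\delta)^*\bigl(\tilde K(x_k^\delta) - \tilde y^\delta\bigr), \qquad k = 0, 1, 2, \ldots
\]

Therefore, we assume from now on that $a = 1$ and $\|K'(x)\|_{\mathcal{L}(X,Y)} \leq 1$ for all $x \in B(\hat x, \rho)$. The Landweber iteration thus takes the form

\[
x_0^\delta = \hat x, \qquad x_{k+1}^\delta = x_k^\delta - K'(x_k^\delta)^*\bigl(K(x_k^\delta) - y^\delta\bigr), \qquad k = 0, 1, 2, \ldots \tag{4.29}
\]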
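The iteration (4.29) can be sketched in a few lines. The following is only an illustration under assumptions that are not taken from the text: a hypothetical toy operator $K : \mathbb{R}^2 \to \mathbb{R}^2$ whose Jacobian has norm at most one near the solution, exact data, and a hypothetical tolerance parameter `tol` for stopping on the residual (for noisy data one would stop once the residual drops below a multiple of the noise level).

```python
import math

def K(x):
    """Hypothetical toy forward operator on R^2 (for illustration only)."""
    return (0.5 * x[0] + 0.05 * x[1] ** 2, 0.3 * x[1])

def K_prime_adjoint(x, r):
    """Transpose of the Jacobian [[0.5, 0.1*x2], [0, 0.3]] of K at x, applied to r."""
    return (0.5 * r[0], 0.1 * x[1] * r[0] + 0.3 * r[1])

def landweber(x_hat, y, max_iter, tol=0.0):
    """Nonlinear Landweber iteration x_{k+1} = x_k - K'(x_k)^*(K(x_k) - y), x_0 = x_hat."""
    x = tuple(x_hat)
    iterates = [x]
    for _ in range(max_iter):
        Kx = K(x)
        r = (Kx[0] - y[0], Kx[1] - y[1])      # residual K(x_k) - y
        if math.hypot(r[0], r[1]) <= tol:     # stop on small residual
            break
        g = K_prime_adjoint(x, r)
        x = (x[0] - g[0], x[1] - g[1])
        iterates.append(x)
    return iterates

x_true = (0.4, 0.6)
y_exact = K(x_true)                           # noise-free data
iterates = landweber((0.0, 0.0), y_exact, max_iter=200)
errors = [math.hypot(x[0] - x_true[0], x[1] - x_true[1]) for x in iterates]
print(errors[0], errors[-1])
```

In this example the error decreases from step to step. The convergence is slow because the step is governed by the smallest singular value of the Jacobian, which is typical for Landweber-type methods.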


The following, rather strong, assumption is called the “tangential cone condi- 
tion” and will ensure that the Landweber iteration is well-defined and will also 
provide convergence. 


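Assumption 4.32 (Tangential Cone Condition)

Let $B(\hat x, \rho) \subset D(K)$ be some ball, $K$ differentiable in $B(\hat x, \rho)$ with $\|K'(x)\|_{\mathcal{L}(X,Y)} \leq 1$ for all $x \in B(\hat x, \rho)$. Furthermore, let $K'$ be Lipschitz continuous on $B(\hat x, \rho)$, and there exists $\eta < \frac{1}{2}$ with

\[
\|K(\tilde x) - K(x) - K'(x)(\tilde x - x)\|_Y \leq \eta \, \|K(\tilde x) - K(x)\|_Y \tag{4.30}
\]

for all $x, \tilde x \in B(\hat x, \rho)$.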


We compare this estimate with the estimate

\[
\|K(\tilde x) - K(x) - K'(x)(\tilde x - x)\|_Y \leq c \, \|\tilde x - x\|_X^2 ,
\]

which holds for Lipschitz-continuously differentiable functions (see Lemma A.63). If there exists $c'$ with $\|\tilde x - x\|_X \leq c' \, \|K(\tilde x) - K(x)\|_Y$ then (4.30) follows. However, such an estimate does not hold for ill-posed problems (see Definition 4.1 and Problem 4.1). The condition (4.30) is very strong but in some cases it can be verified (see Problem 4.7). First we draw some conclusions from this condition. The first one is a basic inequality which will be used quite often in the following.


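Corollary 4.33 Under Assumption 4.32, the following holds:

\[
\frac{1}{1 + \eta} \, \|K'(x)(\tilde x - x)\|_Y \leq \|K(\tilde x) - K(x)\|_Y \leq \frac{1}{1 - \eta} \, \|K'(x)(\tilde x - x)\|_Y \tag{4.31}
\]

for all $x, \tilde x \in B(\hat x, \rho)$.

Proof: The right estimate follows from

\[
\begin{aligned}
\|K(\tilde x) - K(x)\|_Y &\leq \|K(\tilde x) - K(x) - K'(x)(\tilde x - x)\|_Y + \|K'(x)(\tilde x - x)\|_Y \\
 &\leq \eta \, \|K(\tilde x) - K(x)\|_Y + \|K'(x)(\tilde x - x)\|_Y ,
\end{aligned}
\]

and the left from

\[
\begin{aligned}
\|K(\tilde x) - K(x)\|_Y &= \bigl\| K'(x)(\tilde x - x) - \bigl[ K'(x)(\tilde x - x) - K(\tilde x) + K(x) \bigr] \bigr\|_Y \\
 &\geq \|K'(x)(\tilde x - x)\|_Y - \eta \, \|K(\tilde x) - K(x)\|_Y .
\end{aligned}
\]

With this corollary, we try to justify the notion "tangential cone condition" and define the convex cones $C_\pm(x)$ in $\mathbb{R} \times X$ with vertex at $(0, x)$ as

\[
C_\pm(x) = \Bigl\{ (r, \tilde x) \in \mathbb{R} \times X : r \geq \frac{1}{1 \pm \eta} \, \|K'(x)(\tilde x - x)\|_Y \Bigr\} .
\]

Then (4.31) can be formulated as $\bigl(\|K(\tilde x) - K(x)\|_Y, \tilde x\bigr) \in C_+(x) \setminus \operatorname{int} C_-(x)$ for every $x, \tilde x \in B(\hat x, \rho)$. Therefore, for every $x \in B(\hat x, \rho)$ the graph of $\tilde x \mapsto \|K(\tilde x) - K(x)\|_Y$ lies between the cones $C_+(x)$ and $C_-(x)$ which are built by the tangent space at $x$.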

We draw two conclusions from this corollary. Under Assumption 4.32, the reverse of Theorem 4.3 holds, and there exists a unique minimum norm solution with respect to $\hat x$.


Definition 4.34 Let $K : X \supset D(K) \to Y$ be a nonlinear mapping, $y^* \in Y$, and $\hat x \in X$. A point $x^* \in D(K)$ with $K(x^*) = y^*$ is called minimum norm solution with respect to $\hat x$ if $\|x^* - \hat x\|_X \leq \|x - \hat x\|_X$ for all solutions $x \in D(K)$ of $K(x) = y^*$.

Lemma 4.35 Let Assumption 4.32 hold for some ball $B(\hat x, \rho)$ and let $x^* \in B(\hat x, \rho)$ with $K(x^*) = y^*$.

(a) Let the linear equation $K'(x^*)h = 0$ be ill-posed in the sense of Definition 4.1. Then the nonlinear equation $K(x) = y$ is locally ill-posed in $x^*$.

(b) $x^*$ is a minimum norm solution with respect to $\hat x$ if, and only if, $x^* - \hat x \perp \mathcal{N}(K'(x^*))$.

(c) There exists a unique minimum norm solution $x^\dagger$ of $K(x) = y^*$ with respect to $\hat x$.


Proof: (a) Let $r \in (0, \rho - \|x^* - \hat{x}\|_X)$. Since the linear equation is ill-posed
there exists a sequence $h_n \in X$ with $\|h_n\|_X = 1$ and $\|K'(x^*)h_n\|_Y \to 0$ for
$n \to \infty$. We set $x_n = x^* + r h_n$. For $x = x^*$ and $\tilde{x} = x_n$ in (4.31) it follows that
$K(x_n) \to K(x^*)$ and $\|x_n - x^*\|_X = r$.

(b) For the characterization of a minimum norm solution it suffices to compare
$\|x^* - \hat{x}\|_X$ with $\|\tilde{x} - \hat{x}\|_X$ for solutions $\tilde{x} \in B(\hat{x},\rho)$ of $K(\tilde{x}) = y^*$. The estimate
(4.31) for $x = x^*$ implies that $\tilde{x} \in B(\hat{x},\rho)$ is a solution of $K(\tilde{x}) = y^*$ if, and only
if, $K'(x^*)(x^* - \tilde{x}) = 0$; that is, if $x^* - \tilde{x} \in N(K'(x^*))$. The point $x^*$ is a minimum
norm solution with respect to $\hat{x}$ if, and only if, $\|x^* - \hat{x}\|_X \le \|\tilde{x} - \hat{x}\|_X$ for all
solutions $\tilde{x} \in B(\hat{x},\rho)$ of $K(\tilde{x}) = y^*$; that is, for all $\tilde{x}$ with $x^* - \tilde{x} \in N(K'(x^*))$.
Replacing $x^* - \tilde{x}$ by $z$ shows that $x^*$ is a minimum norm solution with respect
to $\hat{x}$ if, and only if, $0$ is the best approximation of $x^* - \hat{x}$ in $N(K'(x^*))$, which
is equivalent to $x^* - \hat{x} \perp N(K'(x^*))$.

(c) The subspace $N(K'(x^*))$ is closed. Therefore, there exists a unique best
approximation $p$ of $x^* - \hat{x}$ in $N(K'(x^*))$.⁷ Then $x^\dagger = x^* - p \in x^* + N(K'(x^*))$
is the best approximation of $\hat{x}$ in $x^* + N(K'(x^*))$. Also, $K(x^\dagger) = y^*$ because
$x^* - x^\dagger = p \in N(K'(x^*))$. Finally, $\|x^\dagger - \hat{x}\|_X \le \|x^* - \hat{x}\|_X < \rho$, thus $x^\dagger \in B(\hat{x},\rho)$.

⁷ This follows from a general theorem on best approximations in Hilbert spaces.
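
For a linear operator, the characterization in part (b) can be checked numerically. The following is a minimal sketch under stated assumptions: a matrix `A` with nontrivial null space stands in for $K$ (and hence for $K'(x^*)$), and the vectors `x_star` and `x_hat` are arbitrary illustrative data, not taken from the text. The helper names are hypothetical.

```python
import numpy as np

def minimum_norm_solution(A, y, x_hat):
    """Solution of A x = y closest to x_hat (y is assumed to lie in the range of A)."""
    # x_hat plus the minimum norm correction h solving A h = y - A x_hat
    return x_hat + np.linalg.pinv(A) @ (y - A @ x_hat)

def null_space_basis(A, tol=1e-12):
    """Orthonormal basis of N(A), read off from the singular value decomposition."""
    _, s, vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0]])       # rank 2, so N(A) has dimension 2
x_star = np.array([1.0, -1.0, 2.0, 0.5])   # some solution of A x = y
y = A @ x_star
x_hat = np.array([0.3, 0.1, -0.4, 1.0])    # reference point

x_dagger = minimum_norm_solution(A, y, x_hat)
N = null_space_basis(A)
# Part (b): x_dagger - x_hat should be orthogonal to every null space vector
orthogonality_defect = np.linalg.norm(N.T @ (x_dagger - x_hat))
```

In this linear setting `orthogonality_defect` vanishes up to rounding, and any other solution `x_dagger + N @ c` with `c != 0` lies strictly farther from `x_hat`.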


The following result proves that the Landweber iteration is well-defined, and it 
motivates a stopping rule. 


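Theorem 4.36 Let Assumption 4.32 hold. Let $\|y^\delta - y^*\|_Y \le \delta$ and $x^* \in B(\hat{x},\rho/2)$
with $K(x^*) = y^*$. Furthermore, let $r > \frac{2(1+\eta)}{1-2\eta}$ and assume that
there exists $k_* \in \mathbb{N}$ with $x_k^\delta \in B(x^*,\rho/2)$ for all $k = 0,\ldots,k_*-1$ (where $x_k^\delta$ are
defined by (4.29)) such that

\[
  \|K(x_k^\delta) - y^\delta\|_Y \;\ge\; r\delta \quad\text{for all } k = 0,\ldots,k_*-1. \tag{4.32}
\]

Then the following holds.

(a) $\|x_{k+1}^\delta - x^*\|_X \le \|x_k^\delta - x^*\|_X$ for all $k = 0,\ldots,k_*-1$. In particular,
$\|x_k^\delta - x^*\|_X \le \|x_0^\delta - x^*\|_X = \|\hat{x} - x^*\|_X < \rho/2$ for all $k = 0,\ldots,k_*$.
Therefore, all $x_k^\delta$ belong to $B(x^*,\rho/2) \subset B(\hat{x},\rho)$ for $k = 0,\ldots,k_*$.

(b) For all $\ell \in \{0,\ldots,k_*-1\}$ it holds that

\[
  \Bigl(1 - 2\eta - \frac{2(1+\eta)}{r}\Bigr) \sum_{k=\ell}^{k_*-1} \|K(x_k^\delta) - y^\delta\|_Y^2
  \;\le\; \|x_\ell^\delta - x^*\|_X^2 - \|x_{k_*}^\delta - x^*\|_X^2 . \tag{4.33}
\]

(c) If $\delta = 0$ then (4.32) holds for all $k \in \mathbb{N}$, and (we write $x_k$ instead of $x_k^0$)

\[
  \sum_{k=0}^{\infty} \|K(x_k) - y^*\|_Y^2 \;\le\; \frac{\|x_0 - x^*\|_X^2}{1 - 2\eta} . \tag{4.34}
\]

In particular, $K(x_k)$ converges to $y^*$.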
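Proof: Since $\|K'(x_k^\delta)^*\|_{\mathcal{L}(Y,X)} \le 1$ we have for $k = 0,\ldots,k_*-1$

\[
\begin{aligned}
 &\|x_{k+1}^\delta - x^*\|_X^2 - \|x_k^\delta - x^*\|_X^2 \\
 &\quad= 2\,\mathrm{Re}\,(x_k^\delta - x^*,\, x_{k+1}^\delta - x_k^\delta)_X + \|x_{k+1}^\delta - x_k^\delta\|_X^2 \\
 &\quad= 2\,\mathrm{Re}\,\bigl(x_k^\delta - x^*,\, K'(x_k^\delta)^*(y^\delta - K(x_k^\delta))\bigr)_X + \|K'(x_k^\delta)^*(y^\delta - K(x_k^\delta))\|_X^2 \\
 &\quad\le 2\,\mathrm{Re}\,\bigl(K'(x_k^\delta)(x_k^\delta - x^*),\, y^\delta - K(x_k^\delta)\bigr)_Y + \|y^\delta - K(x_k^\delta)\|_Y^2 \\
 &\quad= 2\,\mathrm{Re}\,\bigl(K'(x_k^\delta)(x_k^\delta - x^*) - K(x_k^\delta) + y^\delta,\, y^\delta - K(x_k^\delta)\bigr)_Y - \|y^\delta - K(x_k^\delta)\|_Y^2 \\
 &\quad\le \bigl[2\eta\,\|K(x_k^\delta) - y^*\|_Y + 2\delta - \|K(x_k^\delta) - y^\delta\|_Y\bigr]\,\|K(x_k^\delta) - y^\delta\|_Y \\
 &\quad\le \bigl[(2\eta - 1)\,\|K(x_k^\delta) - y^\delta\|_Y + 2\eta\delta + 2\delta\bigr]\,\|K(x_k^\delta) - y^\delta\|_Y \\
 &\quad= \bigl[2\delta(1+\eta) - (1-2\eta)\,\|K(x_k^\delta) - y^\delta\|_Y\bigr]\,\|K(x_k^\delta) - y^\delta\|_Y \qquad (4.35) \\
 &\quad= \frac{1}{r}\bigl[2(1+\eta)\,\delta r - (1-2\eta)\,r\,\|K(x_k^\delta) - y^\delta\|_Y\bigr]\,\|K(x_k^\delta) - y^\delta\|_Y \\
 &\quad\le \frac{2(1+\eta) - (1-2\eta)\,r}{r}\,\|K(x_k^\delta) - y^\delta\|_Y^2 ,
\end{aligned}
\]

where we used (4.32) in the last estimate. This implies (a), because
$2(1+\eta) - (1-2\eta)r < 0$ by the choice of $r$.

With $\alpha = [(1-2\eta)r - 2(1+\eta)]/r > 0$ we have just shown the estimate

\[
  \alpha\,\|K(x_k^\delta) - y^\delta\|_Y^2 \;\le\; \|x_k^\delta - x^*\|_X^2 - \|x_{k+1}^\delta - x^*\|_X^2 , \qquad k = 0,\ldots,k_*-1,
\]

and thus for $\ell \in \{0,\ldots,k_*-1\}$:

\[
  \alpha \sum_{k=\ell}^{k_*-1} \|K(x_k^\delta) - y^\delta\|_Y^2 \;\le\; \|x_\ell^\delta - x^*\|_X^2 - \|x_{k_*}^\delta - x^*\|_X^2 ,
\]

which is (4.33).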

(c) We consider the above estimate up to line (4.35) for $\delta = 0$; that is,

\[
  (1-2\eta)\,\|K(x_k) - y^*\|_Y^2 \;\le\; \|x_k - x^*\|_X^2 - \|x_{k+1} - x^*\|_X^2 ,
\]

and thus for all $m \in \mathbb{N}$

\[
  (1-2\eta) \sum_{k=0}^{m-1} \|K(x_k) - y^*\|_Y^2 \;\le\; \|x_0 - x^*\|_X^2 - \|x_m - x^*\|_X^2 \;\le\; \|x_0 - x^*\|_X^2 .
\]

Therefore the series converges, which yields the desired estimate.


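Let now $\delta > 0$ and again $r > \frac{2(1+\eta)}{1-2\eta}$. The condition (4.32) cannot hold for
every $k_*$. Indeed, otherwise the sequence $(\|x_k^\delta - x^*\|_X)$ would be a monotonically
decreasing and bounded sequence, thus convergent. From (4.33) for $\ell = k_* - 1$
it would follow that

\[
  \Bigl(1 - 2\eta - \frac{2(1+\eta)}{r}\Bigr)\,\|K(x_{k_*-1}^\delta) - y^\delta\|_Y^2
  \;\le\; \|x_{k_*-1}^\delta - x^*\|_X^2 - \|x_{k_*}^\delta - x^*\|_X^2 \;\longrightarrow\; 0
\]

for $k_* \to \infty$, a contradiction because the left-hand side is bounded below by
the positive constant $\bigl(1 - 2\eta - 2(1+\eta)/r\bigr)\,r^2\delta^2$. Therefore, the following stopping
rule is well-defined.

Stopping rule: Let $\delta > 0$ and $r > \frac{2(1+\eta)}{1-2\eta}$. We define $k_* = k_*(\delta) \in \mathbb{N}$ as the
uniquely determined number such that

\[
  \|K(x_k^\delta) - y^\delta\|_Y \;\ge\; r\delta \;>\; \|K(x_{k_*}^\delta) - y^\delta\|_Y \quad\text{for all } k = 0,\ldots,k_*-1. \tag{4.36}
\]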
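
The iteration together with the stopping rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not an implementation from the text: the function name `landweber`, its argument names, and the toy operator `K_toy` on $\mathbb{R}^3$ are all hypothetical. The toy operator satisfies $\|K'(x)\| \le 0.55$, and the value `r = 3.0` presumes that the tangential cone constant satisfies $\eta \le 0.1$ near the solution.

```python
import numpy as np

def landweber(K, K_prime_adjoint, y_delta, x_hat, delta, r, max_iter=10_000):
    """Nonlinear Landweber iteration stopped by the discrepancy rule (4.36).

    K               : callable, x -> K(x)
    K_prime_adjoint : callable, (x, v) -> K'(x)^* v
    Returns the final iterate, the stopping index k_*, and all residual norms.
    """
    x = np.array(x_hat, dtype=float)
    residuals = []
    for k in range(max_iter + 1):
        res = K(x) - y_delta
        residuals.append(np.linalg.norm(res))
        if residuals[-1] < r * delta:        # first index at which (4.32) fails
            return x, k, residuals
        x = x - K_prime_adjoint(x, res)      # x_{k+1} = x_k - K'(x_k)^*(K(x_k) - y^delta)
    raise RuntimeError("stopping rule not reached within max_iter")

# Toy problem (assumption): componentwise map with derivative bounded by 0.55
def K_toy(x):
    return 0.5 * (x + 0.1 * np.sin(x))

def K_toy_prime_adjoint(x, v):
    # K'(x) is diagonal and real, hence equal to its adjoint
    return 0.5 * (1.0 + 0.1 * np.cos(x)) * v

rng = np.random.default_rng(0)
x_true = np.array([0.3, -0.2, 0.1])
delta = 1e-3
noise = rng.standard_normal(3)
y_delta = K_toy(x_true) + delta * noise / np.linalg.norm(noise)   # noise of norm delta

x_rec, k_star, residuals = landweber(
    K_toy, K_toy_prime_adjoint, y_delta, np.zeros(3), delta, r=3.0
)
```

The returned list `residuals` has length `k_star + 1`; all entries except the last are at least `r * delta`, which is exactly the two-sided condition (4.36).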


With this stopping rule, we can show convergence as $\delta \to 0$. First we note that
for any fixed $\delta > 0$ and $k \le k_*(\delta)$, the mapping $y^\delta \mapsto x_k^\delta$ is continuous. Indeed,
in each of the first $k$ steps of the algorithm only continuous operations of the
Landweber iteration are performed.


We first consider the case of no noise; that is, $\delta = 0$.


Theorem 4.37 Let Assumption 4.32 hold and let $\rho > 0$ and $x^* \in B(\hat{x},\rho/2)$
with $K(x^*) = y^*$. The Landweber iteration is well defined for $\delta = 0$ (that is,
the iterates $x_k$ belong to $D(K)$) and the sequence $(x_k)$ converges to a solution
$\tilde{x} \in B[x^*,\rho/2] \subset B(\hat{x},\rho)$ of $K(x) = y^*$.

Proof: We have seen already in Theorem 4.36 that $x_k \in B(x^*,\rho/2)$ for all $k$.
We now show that $(x_k)$ is a Cauchy sequence. Let $\ell \le m$ be fixed. Determine
$k$ with $\ell \le k \le m$ and

\[
  \|K(x_k) - y^*\|_Y \;\le\; \|K(x_i) - y^*\|_Y \quad\text{for all } \ell \le i \le m .
\]

Because of $\|x_\ell - x_m\|^2 \le 2\|x_\ell - x_k\|^2 + 2\|x_k - x_m\|^2$ we estimate both terms
separately and use the formula $\|u - v\|^2 + \|u\|^2 - \|v\|^2 = 2\,\mathrm{Re}\,(u, u - v)$ and the
left estimate of (4.31):

\[
\begin{aligned}
 &\|x_k - x_m\|_X^2 + \|x_k - x^*\|_X^2 - \|x_m - x^*\|_X^2 \\
 &\quad= 2\,\mathrm{Re}\,(x_k - x^*,\, x_k - x_m)_X \;=\; 2 \sum_{i=k}^{m-1} \mathrm{Re}\,(x_k - x^*,\, x_i - x_{i+1})_X \\
 &\quad= 2 \sum_{i=k}^{m-1} \mathrm{Re}\,\bigl(x_k - x^*,\, K'(x_i)^*(K(x_i) - y^*)\bigr)_X \\
 &\quad= 2 \sum_{i=k}^{m-1} \mathrm{Re}\,\bigl(K'(x_i)(x_k - x^*),\, K(x_i) - y^*\bigr)_Y \\
 &\quad\le 2 \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y\, \|K'(x_i)(x_k - x^*)\|_Y \\
 &\quad\le 2 \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y \bigl[\|K'(x_i)(x_i - x^*)\|_Y + \|K'(x_i)(x_i - x_k)\|_Y\bigr] \\
 &\quad\le 2(1+\eta) \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y \bigl[\|K(x_i) - K(x^*)\|_Y + \|K(x_i) - K(x_k)\|_Y\bigr] \\
 &\quad\le 2(1+\eta) \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y \bigl[\|K(x_i) - y^*\|_Y + \|K(x_i) - y^*\|_Y + \|y^* - K(x_k)\|_Y\bigr] \\
 &\quad\le 2(1+\eta) \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y \bigl[\|K(x_i) - y^*\|_Y + \|K(x_i) - y^*\|_Y + \|y^* - K(x_i)\|_Y\bigr] \\
 &\quad= 6(1+\eta) \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y^2 .
\end{aligned}
\]

Analogously, we estimate

\[
  \|x_\ell - x_k\|_X^2 + \|x_k - x^*\|_X^2 - \|x_\ell - x^*\|_X^2 \;\le\; 6(1+\eta) \sum_{i=\ell}^{k-1} \|K(x_i) - y^*\|_Y^2
\]

and thus

\[
\begin{aligned}
 \|x_\ell - x_m\|_X^2 &\le 2\|x_\ell - x_k\|_X^2 + 2\|x_k - x_m\|_X^2 \\
 &\le 2\|x_\ell - x^*\|_X^2 - 2\|x_k - x^*\|_X^2 + 12(1+\eta) \sum_{i=\ell}^{k-1} \|K(x_i) - y^*\|_Y^2 \\
 &\qquad + 2\|x_m - x^*\|_X^2 - 2\|x_k - x^*\|_X^2 + 12(1+\eta) \sum_{i=k}^{m-1} \|K(x_i) - y^*\|_Y^2 \\
 &= 2\|x_\ell - x^*\|_X^2 + 2\|x_m - x^*\|_X^2 - 4\|x_k - x^*\|_X^2 + 12(1+\eta) \sum_{i=\ell}^{m-1} \|K(x_i) - y^*\|_Y^2 .
\end{aligned}
\]

Since both the sequence $(\|x_\ell - x^*\|_X)_\ell$ and the series $\sum_{i=0}^{\infty} \|K(x_i) - y^*\|_Y^2$
converge by Theorem 4.36, we conclude that also $\|x_\ell - x_m\|_X^2$ converges to
zero as $\ell \to \infty$. Therefore, $(x_k)$ is a Cauchy sequence and thus convergent;
that is, $x_k \to \tilde{x}$ for some $\tilde{x} \in B[x^*,\rho/2]$. The continuity of $K$ implies that
$K(\tilde{x}) = y^*$.


Now we prove convergence of $x_{k_*(\delta)}^\delta$ as $\delta \to 0$ if $k_*(\delta)$ is determined by the
stopping rule.


Theorem 4.38 Let Assumption 4.32 hold and let $\rho > 0$ and $x^* \in B(\hat{x},\rho/2)$
with $K(x^*) = y^*$ and $r > \frac{2(1+\eta)}{1-2\eta}$. Then the sequence stops by the stopping rule
(4.36) and defines $k_*(\delta)$ for all $\delta > 0$. Then $\lim_{\delta \to 0} x_{k_*(\delta)}^\delta = \tilde{x}$ where $\tilde{x} \in B[x^*,\rho/2]$
is the limit of the sequence for $\delta = 0$, which exists by the previous theorem and
is a solution of $K(\tilde{x}) = y^*$.


Proof: Let $(x_k)$ be the sequence of the Landweber iteration for $\delta = 0$; that
is, $x_k = x_k^0$, and $\tilde{x} = \lim_{k\to\infty} x_k$. Furthermore, let $(\delta_n)$ be a sequence which
converges to zero. Set for abbreviation $k_n = k_*(\delta_n)$. We distinguish between
two cases.

Case 1: The sequence $(k_n)$ of natural numbers has finite accumulation points.
Let $k \in \mathbb{N}$ be the smallest accumulation point. Then there exists $I \subset \mathbb{N}$ of
infinite cardinality such that $k_n = k$ for all $n \in I$. By the definition of $k_n$ we
have

\[
  \|K(x_k^{\delta_n}) - y^{\delta_n}\|_Y \;<\; r\delta_n \quad\text{for all } n \in I .
\]

Since for this fixed $k$ the iterate $x_k^{\delta_n}$ depends continuously on $y^{\delta_n}$, and since
$y^{\delta_n}$ converges to $y^*$, we conclude for $n \in I$, $n \to \infty$, that $x_k^{\delta_n} \to x_k$ (this is the
$k$-th iterate of the sequence for $\delta = 0$) and $K(x_k) = y^*$. Landweber's iteration
for $\delta = 0$ implies that then $x_m = x_k$ for all $m \ge k$; that is, the sequence is
constant for $m \ge k$. In particular, $x_m = \tilde{x}$ for all $m \ge k$. The same argument
holds for any other accumulation point $\tilde{k} \ge k$ and thus
$\lim_{n\to\infty,\, n \in I} x_{k_*(\delta_n)}^{\delta_n} = x_k = \tilde{x}$.

Case 2: The sequence $(k_n)_n$ tends to infinity. Without loss of generality, we
assume that $k_n$ converges monotonically to infinity. Let $n > m$. We apply
Theorem 4.36 for $\tilde{x}$ instead of $x^*$ and get

\[
  \|x_{k_n}^{\delta_n} - \tilde{x}\|_X \;\le\; \|x_{k_n - 1}^{\delta_n} - \tilde{x}\|_X \;\le\; \cdots \;\le\; \|x_{k_m}^{\delta_n} - \tilde{x}\|_X
  \;\le\; \|x_{k_m}^{\delta_n} - x_{k_m}\|_X + \|x_{k_m} - \tilde{x}\|_X .
\]

Let $\varepsilon > 0$. Choose $m$ such that $\|x_{k_m} - \tilde{x}\|_X < \varepsilon/2$. For this fixed $m$, the
sequence $x_{k_m}^{\delta_n}$ converges to $x_{k_m}$ for $n \to \infty$. Therefore, we can find $n_0$ with
$\|x_{k_m}^{\delta_n} - x_{k_m}\|_X < \varepsilon/2$ for all $n \ge n_0$. For these $n \ge n_0$, we have $\|x_{k_n}^{\delta_n} - \tilde{x}\|_X < \varepsilon$.
Therefore, we have shown that for every sequence $\delta_n \to 0$ the stopped iterates
$x_{k_*(\delta_n)}^{\delta_n}$ converge to the same limit $\tilde{x}$. This ends the proof.


Before we prove a result on the order of convergence, we formulate a condition 
under which the Landweber iteration converges to the minimum norm solution. 


Lemma 4.39 Let Assumption 4.32 hold and let $x^* \in B(\hat{x},\rho/2)$ be the minimum
norm solution of $K(x) = y^*$ with respect to $\hat{x}$. (It exists and is unique by
Lemma 4.35.) Let, furthermore, $N(K'(x^*)) \subset N(K'(x))$ for all $x \in B(\hat{x},\rho/2)$.
Then the sequences $(x_k)$ and $(x_{k_*(\delta)}^\delta)$ of the Landweber iteration converge to
the minimum norm solution $x^*$ of $K(x) = y^*$.


Proof: Convergence of the sequences has been shown in Theorems 4.37 and
4.38 already. Let $\tilde{x} = \lim_{k\to\infty} x_k$. Then also $\tilde{x} = \lim_{\delta\to 0} x_{k_*(\delta)}^\delta$ by Theorem 4.38.
We show by induction that $x_k - x^* \perp N(K'(x^*))$ for all $k = 0,1,\ldots$. For $k = 0$
this is true because $x_0 = \hat{x}$, the minimum property of $x^*$, and part (b) of
Lemma 4.35. Let the assertion be true for $k$ and let $z \in N(K'(x^*))$. Then
$z \in N(K'(x_k))$, thus

\[
  (x_{k+1} - x^*,\, z)_X \;=\; (x_k - x^*,\, z)_X - \bigl(K(x_k) - y^*,\, K'(x_k)z\bigr)_Y \;=\; 0 .
\]

Since the orthogonal complement of $N(K'(x^*))$ is closed we conclude that also
$\tilde{x} - x^* \perp N(K'(x^*))$. Furthermore, $\tilde{x} - x^* \in N(K'(x^*))$ because of (4.31) for
$x = x^*$. Therefore, $\tilde{x} = x^*$.


We will now prove a rate of convergence. This part will be a bit more technical. 
It is not surprising that we need the source condition of Assumption 4.11 just as 
in the linear case or for the nonlinear Tikhonov regularization. Unfortunately, 
Assumptions 4.32 and 4.11 are not sufficient, and we have to strengthen them. 


Assumption 4.40 Let Assumption 4.32 hold and $x^* \in B(\hat{x},\rho)$ be the minimum
norm solution of $K(x) = y^*$ with respect to $\hat{x}$. (It exists by Lemma 4.35.)
Furthermore, let $K'(x^*)$ be compact and let there exist $C > 0$ and a family
$\{R_x : x \in B(\hat{x},\rho)\}$ of linear bounded operators $R_x : Y \to Y$ with

\[
  K'(x) = R_x K'(x^*) \quad\text{and}\quad \|R_x - I\|_{\mathcal{L}(Y,Y)} \le C\,\|x - x^*\|_X \quad\text{for } x \in B(\hat{x},\rho).
\]

In the linear case, both conditions are satisfied for $R_x = I$ because $K'(x) = K$
for all $x$. Under this additional assumption, we can partially sharpen the
tangential cone condition of Assumption 4.32.


Lemma 4.41 Under Assumption 4.40, we have $N(K'(x^*)) \subset N(K'(x))$ for
all $x \in B(\hat{x},\rho)$ and

\[
  \|K(x) - K(x^*) - K'(x^*)(x - x^*)\|_Y \;\le\; \frac{C}{2}\,\|x - x^*\|_X\, \|K'(x^*)(x - x^*)\|_Y , \tag{4.37}
\]

\[
  \|K(x) - K(x^*) - K'(x^*)(x - x^*)\|_Y \;\le\; \frac{C}{2 - C\rho}\,\|x - x^*\|_X\, \|K(x) - K(x^*)\|_Y \tag{4.38}
\]

for all $x \in B(\hat{x},\rho)$. Therefore, if we choose $\rho$ such that $\eta := \frac{C\rho}{2 - C\rho} < 1/2$ then
condition (4.30) is satisfied for $x = x^*$.


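Proof: The first assertion is obvious. For the second, we write

\[
\begin{aligned}
 K(x) - K(x^*) - K'(x^*)(x - x^*)
 &= \int_0^1 \Bigl[\frac{d}{dt} K\bigl(tx + (1-t)x^*\bigr) - K'(x^*)(x - x^*)\Bigr]\, dt \\
 &= \int_0^1 \bigl[K'\bigl(tx + (1-t)x^*\bigr) - K'(x^*)\bigr](x - x^*)\, dt ,
\end{aligned}
\]

thus

\[
\begin{aligned}
 \|K(x) - K(x^*) - K'(x^*)(x - x^*)\|_Y
 &\le \int_0^1 \bigl\|\bigl[K'\bigl(tx + (1-t)x^*\bigr) - K'(x^*)\bigr](x - x^*)\bigr\|_Y\, dt \\
 &\le \int_0^1 \|R_{tx+(1-t)x^*} - I\|_{\mathcal{L}(Y,Y)}\, dt\; \|K'(x^*)(x - x^*)\|_Y \\
 &\le C \int_0^1 t\, dt\; \|x - x^*\|_X\, \|K'(x^*)(x - x^*)\|_Y \\
 &= \frac{C}{2}\,\|x - x^*\|_X\, \|K'(x^*)(x - x^*)\|_Y ,
\end{aligned}
\]

which proves (4.37). We continue and estimate

\[
\begin{aligned}
 \|K(x) - K(x^*) - K'(x^*)(x - x^*)\|_Y
 &\le \frac{C}{2}\,\|x - x^*\|_X\, \|K'(x^*)(x - x^*) + K(x^*) - K(x)\|_Y \\
 &\qquad + \frac{C}{2}\,\|x - x^*\|_X\, \|K(x^*) - K(x)\|_Y
\end{aligned}
\]

and thus

\[
  \bigl(2 - C\,\|x - x^*\|_X\bigr)\, \|K(x) - K(x^*) - K'(x^*)(x - x^*)\|_Y \;\le\; C\,\|x - x^*\|_X\, \|K(x^*) - K(x)\|_Y .
\]

This proves (4.38) because $\|x - x^*\|_X < \rho$.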


Under this Assumption 4.40 (and $x^* \in B(\hat{x},\rho/2)$), we know from Lemma 4.39
that $x_{k_*(\delta)}^\delta$ converges to $x^*$ as $\delta \to 0$. For the proof of the order of convergence,
we need the following elementary estimates:


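Lemma 4.42

\[
  \frac{1 - (1 - x^2)^k}{x} \;\le\; \sqrt{k} \quad\text{for all } k \in \mathbb{N} \text{ and } x \in (0,1], \tag{4.39}
\]

\[
  x\,(1 - x^2)^k \;\le\; \frac{1}{\sqrt{k+1}} \quad\text{for all } k \in \mathbb{N} \text{ and } x \in [0,1], \tag{4.40}
\]

\[
  x^2 (1 - x^2)^k \;\le\; \frac{1}{k+1} \quad\text{for all } k \in \mathbb{N} \text{ and } x \in [0,1], \tag{4.41}
\]

and for every $\alpha \in (0,1]$ there exists $c(\alpha) > 0$ with

\[
  \sum_{j=0}^{k-1} (j+1)^{-\alpha} (k-j)^{-3/2} \;\le\; \frac{c(\alpha)}{(k+1)^{\alpha}} \quad\text{for all } k \in \mathbb{N}. \tag{4.42}
\]

Proof: To show (4.39) let first $0 < x \le 1/\sqrt{k}$. Bernoulli's inequality implies
$(1 - x^2)^k \ge 1 - kx^2$, thus $\frac{1 - (1-x^2)^k}{x} \le \frac{kx^2}{x} = kx \le \sqrt{k}$. Let now $x \ge 1/\sqrt{k}$.
Then $\frac{1 - (1-x^2)^k}{x} \le \frac{1}{x} \le \sqrt{k}$. For $x = 0$ the estimate follows by l'Hospital's rule.
The proofs of estimates (4.40) and (4.41) are elementary and left to the reader
(see also the proof of Theorem 2.8).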

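Proof of (4.42): Define the function $f(x) = (x+1)^{-\alpha}(k - x)^{-3/2}$ for $-1 < x < k$.
By the explicit computation of the second derivative, one obtains that
$f''(x) \ge 0$ for all $-1 < x < k$.⁸ Taylor's formula yields for $j < k$

\[
  f(x) \;=\; f(j) + f'(j)(x - j) + \tfrac{1}{2} f''(z_{j,x})(x - j)^2 \;\ge\; f(j) + f'(j)(x - j)
\]

for $j - 1/2 \le x \le j + 1/2$ with some intermediate point $z_{j,x}$, thus

\[
  (j+1)^{-\alpha}(k-j)^{-3/2} \;=\; f(j) \;\le\; \int_{j-1/2}^{j+1/2} f(x)\, dx
\]

because the integral $\int_{j-1/2}^{j+1/2} (x - j)\, dx$ vanishes by the symmetry of the intervals.
Therefore,

\[
  \sum_{j=0}^{k-1} (j+1)^{-\alpha}(k-j)^{-3/2} \;=\; \sum_{j=0}^{k-1} f(j) \;\le\; \sum_{j=0}^{k-1} \int_{j-1/2}^{j+1/2} f(x)\, dx
  \;=\; \int_{-1/2}^{k/2} f(x)\, dx + \int_{k/2}^{k-1/2} f(x)\, dx .
\]

For the first integral, we have

\[
  \int_{-1/2}^{k/2} f(x)\, dx \;\le\; \Bigl(\frac{2}{k}\Bigr)^{3/2} \int_{-1/2}^{k/2} (x+1)^{-\alpha}\, dx .
\]

If $\alpha < 1$ then

\[
  \int_{-1/2}^{k/2} f(x)\, dx \;\le\; \Bigl(\frac{2}{k}\Bigr)^{3/2} \frac{1}{1-\alpha} \Bigl(\frac{k}{2} + 1\Bigr)^{1-\alpha} \;\le\; \hat{c}(\alpha)\, k^{-\alpha - 1/2} .
\]

If $\alpha = 1$ then

\[
  \int_{-1/2}^{k/2} f(x)\, dx \;\le\; \Bigl(\frac{2}{k}\Bigr)^{3/2} \ln(k+2) \;\le\; \frac{\hat{c}(1)}{k} .
\]

For the remaining integral, we estimate

\[
  \int_{k/2}^{k-1/2} f(x)\, dx \;\le\; \Bigl(\frac{2}{k+2}\Bigr)^{\alpha} \int_{k/2}^{k-1/2} (k - x)^{-3/2}\, dx
  \;=\; 2 \Bigl(\frac{2}{k+2}\Bigr)^{\alpha} \Bigl[(k-x)^{-1/2}\Bigr]_{k/2}^{k-1/2}
  \;\le\; 2\sqrt{2}\, \Bigl(\frac{2}{k+2}\Bigr)^{\alpha} .
\]

This finishes the proof.

⁸ Therefore, $f$ is convex.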
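
The four estimates of Lemma 4.42 are easy to probe numerically. The sketch below is an illustration only, with hypothetical helper names: it samples $x$ on a grid in $(0,1]$ and reports the largest violation of (4.39) to (4.41), where a non-positive value means no violation was found, and it evaluates the quotient of the two sides of (4.42), which should stay bounded in $k$.

```python
import numpy as np

def worst_violations(k_max=200, n_grid=2001):
    """Largest value of (left side - right side) for (4.39)-(4.41) on a grid in (0, 1]."""
    x = np.linspace(0.0, 1.0, n_grid)[1:]
    worst = {"4.39": -np.inf, "4.40": -np.inf, "4.41": -np.inf}
    for k in range(1, k_max + 1):
        p = (1.0 - x**2) ** k
        worst["4.39"] = max(worst["4.39"], np.max((1.0 - p) / x - np.sqrt(k)))
        worst["4.40"] = max(worst["4.40"], np.max(x * p - 1.0 / np.sqrt(k + 1)))
        worst["4.41"] = max(worst["4.41"], np.max(x**2 * p - 1.0 / (k + 1)))
    return worst

def ratio_442(alpha, k):
    """(k+1)^alpha * sum_{j<k} (j+1)^(-alpha) (k-j)^(-3/2); bounded by c(alpha) if (4.42) holds."""
    j = np.arange(k)
    s = np.sum((j + 1.0) ** (-alpha) * (k - j) ** (-1.5))
    return s * (k + 1.0) ** alpha

worst = worst_violations()
sup_half = max(ratio_442(0.5, k) for k in range(1, 2001))
sup_one = max(ratio_442(1.0, k) for k in range(1, 2001))
```

The values `sup_half` and `sup_one` are only empirical lower bounds for admissible constants $c(1/2)$ and $c(1)$ over the sampled range of $k$; they are not the constants of the lemma.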


The estimates (4.39), (4.40), and (4.41) directly imply the following corollary.


Corollary 4.43 Let $A : X \to Y$ be a linear and compact operator with adjoint
$A^*$ and let $\|A\|_{\mathcal{L}(X,Y)} \le 1$. Then

(a) $\|(I - A^*A)^k A^*\|_{\mathcal{L}(Y,X)} \le (k+1)^{-1/2}$,

(b) $\|(I - AA^*)^k AA^*\|_{\mathcal{L}(Y,Y)} \le (k+1)^{-1}$,

(c) $\Bigl\|\sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr\|_{\mathcal{L}(Y,X)} \le \sqrt{k}$.


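Proof: Let $\{\mu_i, x_i, y_i : i \in I\}$ be a singular system for $A$, see Appendix A.6,
Theorem A.57. For $y = y_0 + \sum_{i \in I} \alpha_i y_i \in Y$ with $A^* y_0 = 0$, we have

\[
  \|(I - A^*A)^k A^* y\|_X^2 \;=\; \sum_{i \in I} \bigl[(1 - \mu_i^2)^k \mu_i\bigr]^2 |\alpha_i|^2 \;\le\; \frac{1}{k+1} \sum_{i \in I} |\alpha_i|^2 \;\le\; \frac{1}{k+1}\, \|y\|_Y^2
\]

because of (4.40). Part (b) follows in the same way from (4.41). Finally, we
observe that

\[
  \sum_{j=0}^{k-1} (I - A^*A)^j A^* y \;=\; \sum_{i \in I} \sum_{j=0}^{k-1} (1 - \mu_i^2)^j \mu_i\, \alpha_i\, x_i \;=\; \sum_{i \in I} \frac{1 - (1 - \mu_i^2)^k}{\mu_i}\, \alpha_i\, x_i
\]

and thus by (4.39)

\[
  \Bigl\|\sum_{j=0}^{k-1} (I - A^*A)^j A^* y\Bigr\|_X^2 \;\le\; k \sum_{i \in I} |\alpha_i|^2 \;\le\; k\, \|y\|_Y^2 .
\]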
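
In finite dimensions the three norm bounds of Corollary 4.43 can be evaluated directly. The following sketch is an illustration under assumptions: a random real matrix, scaled to spectral norm one, replaces the compact operator $A$, its transpose replaces $A^*$, and the function name is hypothetical.

```python
import numpy as np

def corollary_443_norms(A, k):
    """Spectral norms of the three operators in Corollary 4.43 for a real matrix A."""
    m, n = A.shape
    B = np.eye(n) - A.T @ A                      # I - A^*A on X = R^n
    D = np.eye(m) - A @ A.T                      # I - AA^* on Y = R^m
    norm_a = np.linalg.norm(np.linalg.matrix_power(B, k) @ A.T, 2)
    norm_b = np.linalg.norm(np.linalg.matrix_power(D, k) @ A @ A.T, 2)
    partial_sum = np.zeros((n, n))
    for j in range(k):                           # sum_{j=0}^{k-1} (I - A^*A)^j
        partial_sum += np.linalg.matrix_power(B, j)
    norm_c = np.linalg.norm(partial_sum @ A.T, 2)
    return norm_a, norm_b, norm_c

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
A /= np.linalg.norm(A, 2)                        # scale so that ||A|| = 1

table = {k: corollary_443_norms(A, k) for k in range(1, 31)}
```

Each entry `table[k]` can be compared with $(k+1)^{-1/2}$, $(k+1)^{-1}$, and $\sqrt{k}$, respectively; for $k = 1$ the bound in (c) is attained because the largest singular value equals one.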


Now we are able to prove the main theorem on the order of convergence. The 
proof is lengthy. 


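Theorem 4.44 Let Assumption 4.40 hold, and let $x^* \in B(\hat{x},\rho)$ be the minimum
norm solution of $K(x) = y^*$ with respect to $\hat{x}$. Define the constant

\[
  c_* \;=\; 2\Bigl(1 + \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}\Bigr)
\]

where $\eta < 1/2$ is the constant of Assumption 4.32. Furthermore, let the source
condition hold; that is, there exists $w \in Y$ such that $x^* - \hat{x} = K'(x^*)^* w$ and

\[
  \frac{9}{2}\, C\, c_*^2 \max\{c(1/2),\, c(1)\}\, \|w\|_Y \;\le\; 1 ,
\]

where $C$ is the constant of Assumption 4.40 and $c(\alpha)$ are the constants of (4.42)
for $\alpha = 1/2$ and $\alpha = 1$. Then there exists a constant $c > 0$ such that

\[
  \|x_{k_*(\delta)}^\delta - x^*\|_X \;\le\; c\, \|w\|_Y^{1/2}\, \delta^{1/2}, \qquad \|K(x_{k_*(\delta)}^\delta) - y^*\|_Y \;\le\; (1 + r)\,\delta . \tag{4.43}
\]

Again, $k_*(\delta)$ is the iteration index determined by the stopping rule (4.36).

Proof: The second estimate follows immediately from the property of $k_*(\delta)$
and the triangle inequality. Set for abbreviation $e_k = x_k^\delta - x^*$ and $A = K'(x^*)$.
Then $K'(x_k^\delta) = R_{x_k^\delta} A$ because of Assumption 4.40 and thus

\[
\begin{aligned}
 e_{k+1} &= e_k - K'(x_k^\delta)^* \bigl(K(x_k^\delta) - y^\delta\bigr) \\
 &= e_k - A^* \bigl(K(x_k^\delta) - y^\delta\bigr) + A^* \bigl(I - R_{x_k^\delta}^*\bigr)\bigl(K(x_k^\delta) - y^\delta\bigr) \\
 &= (I - A^*A)\, e_k - A^* \bigl(K(x_k^\delta) - y^\delta - K'(x^*) e_k\bigr) + A^* \bigl(I - R_{x_k^\delta}^*\bigr)\bigl(K(x_k^\delta) - y^\delta\bigr) \\
 &= (I - A^*A)\, e_k + A^*(y^\delta - y^*) - A^* \bigl[K(x_k^\delta) - K(x^*) - K'(x^*)(x_k^\delta - x^*)\bigr] \\
 &\qquad + A^* \bigl(I - R_{x_k^\delta}^*\bigr)\bigl(K(x_k^\delta) - y^\delta\bigr) \\
 &= (I - A^*A)\, e_k + A^*(y^\delta - y^*) + A^* z_k , \qquad k = 0,\ldots,k_*(\delta) - 1,
\end{aligned} \tag{4.44}
\]

with

\[
  z_k \;=\; \bigl(I - R_{x_k^\delta}^*\bigr)\bigl(K(x_k^\delta) - y^\delta\bigr) - \bigl[K(x_k^\delta) - K(x^*) - K'(x^*)(x_k^\delta - x^*)\bigr] . \tag{4.45}
\]

This is a recursion formula for $e_k$. It is solved in terms of $y^\delta - y^*$ and $z_j$ by

\[
\begin{aligned}
 e_k &= (I - A^*A)^k e_0 + \sum_{j=0}^{k-1} (I - A^*A)^j A^* \bigl[(y^\delta - y^*) + z_{k-j-1}\bigr] \\
 &= -(I - A^*A)^k A^* w + \Bigl[\sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr] (y^\delta - y^*) + \sum_{j=0}^{k-1} (I - A^*A)^j A^* z_{k-j-1}
\end{aligned} \tag{4.46}
\]

for $k = 0,\ldots,k_*(\delta)$. (Proof by induction with respect to $k$.) Furthermore,
because $A (I - A^*A)^j = (I - AA^*)^j A$,

\[
\begin{aligned}
 A e_k &= -(I - AA^*)^k AA^* w + \sum_{j=0}^{k-1} (I - AA^*)^j AA^* (y^\delta - y^*) + \sum_{j=0}^{k-1} (I - AA^*)^j AA^* z_{k-j-1} \\
 &= -(I - AA^*)^k AA^* w - \sum_{j=0}^{k-1} \bigl[(I - AA^*)^{j+1} - (I - AA^*)^j\bigr] (y^\delta - y^*) + \sum_{j=0}^{k-1} (I - AA^*)^j AA^* z_{k-j-1} \\
 &= -(I - AA^*)^k AA^* w + \bigl[I - (I - AA^*)^k\bigr] (y^\delta - y^*) + \sum_{j=0}^{k-1} (I - AA^*)^j AA^* z_{k-j-1}
\end{aligned} \tag{4.47}
\]

for $k = 0,\ldots,k_*(\delta)$. Now we consider $z_k$ from (4.45) for $k \in \{0,\ldots,k_*(\delta) - 1\}$
and estimate both terms of $z_k$ separately. Since $r > 2$ and $\|K(x_k^\delta) - y^\delta\|_Y \ge r\delta > 2\delta$
we have

\[
  \|K(x_k^\delta) - y^\delta\|_Y \;\le\; \|K(x_k^\delta) - y^*\|_Y + \delta \;\le\; \|K(x_k^\delta) - y^*\|_Y + \tfrac{1}{2}\, \|K(x_k^\delta) - y^\delta\|_Y
\]

and thus $\|K(x_k^\delta) - y^\delta\|_Y \le 2\, \|K(x_k^\delta) - y^*\|_Y$. With (4.31) and because $\eta < 1/2$
we have

\[
  \bigl\|\bigl(I - R_{x_k^\delta}^*\bigr)\bigl(K(x_k^\delta) - y^\delta\bigr)\bigr\|_Y \;\le\; \|I - R_{x_k^\delta}^*\|_{\mathcal{L}(Y,Y)}\, \|K(x_k^\delta) - y^\delta\|_Y \;\le\; 4C\, \|e_k\|_X\, \|A e_k\|_Y .
\]

For the second term of $z_k$ we use (4.37) for $x = x_k^\delta$:

\[
  \|K(x_k^\delta) - K(x^*) - K'(x^*)(x_k^\delta - x^*)\|_Y \;\le\; \frac{C}{2}\, \|e_k\|_X\, \|A e_k\|_Y
\]

and thus $\|z_k\|_Y \le \frac{9}{2}\, C\, \|e_k\|_X\, \|A e_k\|_Y$.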
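Now we estimate $\delta$ from above. Because of the stopping rule (4.36) and (4.31)
we have for $k = 0,\ldots,k_*(\delta) - 1$

\[
  r\delta \;\le\; \|K(x_k^\delta) - y^\delta\|_Y \;\le\; \|K(x_k^\delta) - K(x^*)\|_Y + \delta \;\le\; \frac{1}{1-\eta}\, \|A e_k\|_Y + \delta ,
\]

and thus, because $r - 1 > \frac{2(1+\eta)}{1-2\eta} - 1 = \frac{1+4\eta}{1-2\eta}$,

\[
  \delta \;\le\; \frac{1 - 2\eta}{(1+4\eta)(1-\eta)}\, \|A e_k\|_Y , \qquad k = 0,\ldots,k_*(\delta) - 1. \tag{4.48}
\]

Now we show

\[
  \|e_j\|_X \;\le\; c_*\, \|w\|_Y\, \frac{1}{\sqrt{j+1}} , \qquad \|A e_j\|_Y \;\le\; c_*\, \|w\|_Y\, \frac{1}{j+1} \tag{4.49}
\]

and

\[
  \delta \;\le\; 2\, \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}\, \|w\|_Y\, \frac{1}{k+1} \tag{4.50}
\]

for $j = 0,\ldots,k$ and every $k \le k_*(\delta) - 1$. We show these three estimates
by induction with respect to $k < k_*(\delta)$. These are true for $k = 0$ because
$\|e_0\|_X \le \|w\|_Y$ and $\|A e_0\|_Y \le \|w\|_Y$ and by (4.48) and
$\frac{1-2\eta}{(1+4\eta)(1-\eta)} \le \frac{1-2\eta}{\eta(5-4\eta)}$.
Let now (4.49) and (4.50) hold for $k - 1$. From the above representation (4.46)
of $e_k$, parts (a) and (c) of Corollary 4.43, the assumption of induction, and
(4.42) for $\alpha = 1/2$ we conclude that

\[
\begin{aligned}
 \|e_k\|_X &\le \frac{\|w\|_Y}{\sqrt{k+1}} + \sqrt{k}\, \delta + \frac{9}{2}\, C \sum_{j=0}^{k-1} \frac{1}{\sqrt{j+1}}\, \|e_{k-j-1}\|_X\, \|A e_{k-j-1}\|_Y \\
 &\le \frac{\|w\|_Y}{\sqrt{k+1}} + \sqrt{k}\, \delta + \frac{9}{2}\, C\, c_*^2\, \|w\|_Y^2 \sum_{j=0}^{k-1} \frac{1}{\sqrt{j+1}\, (k-j)^{3/2}} \\
 &\le \frac{\|w\|_Y}{\sqrt{k+1}} + \sqrt{k}\, \delta + \frac{9}{2}\, C\, c_*^2\, \|w\|_Y^2\, \frac{c(1/2)}{\sqrt{k+1}} \\
 &\le \frac{2\, \|w\|_Y}{\sqrt{k+1}} + \sqrt{k}\, \delta
\end{aligned}
\]

by assumption on $\|w\|_Y$. Analogously, it follows for $\|A e_k\|_Y$ with (4.47) and
part (b) of Corollary 4.43 and (4.42) for $\alpha = 1$ with constant $c(1)$:

\[
\begin{aligned}
 \|A e_k\|_Y &\le \frac{\|w\|_Y}{k+1} + \delta + \frac{9}{2}\, C \sum_{j=0}^{k-1} \frac{1}{j+1}\, \|e_{k-j-1}\|_X\, \|A e_{k-j-1}\|_Y \\
 &\le \frac{\|w\|_Y}{k+1} + \delta + \frac{9}{2}\, C\, c_*^2\, \|w\|_Y^2 \sum_{j=0}^{k-1} \frac{1}{(j+1)(k-j)^{3/2}} \\
 &\le \frac{\|w\|_Y}{k+1} + \delta + \frac{9}{2}\, C\, c_*^2\, \|w\|_Y^2\, \frac{c(1)}{k+1} \\
 &\le \delta + \frac{2\, \|w\|_Y}{k+1} .
\end{aligned}
\]

Next we substitute this estimate of $\|A e_k\|_Y$ into the estimate (4.48) for $\delta$ and
obtain

\[
  \delta \;\le\; \frac{1 - 2\eta}{(1+4\eta)(1-\eta)} \Bigl(\delta + \frac{2\, \|w\|_Y}{k+1}\Bigr) .
\]

Since $1 - \frac{1-2\eta}{(1+4\eta)(1-\eta)} = \frac{\eta(5-4\eta)}{(1+4\eta)(1-\eta)}$, it follows estimate (4.50) for $k$; that is,

\[
  \delta \;\le\; 2\, \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}\, \frac{\|w\|_Y}{k+1} .
\]

We substitute this into the estimates of $\|e_k\|_X$ and $\|A e_k\|_Y$:

\[
  \|e_k\|_X \;\le\; 2\Bigl[1 + \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}\Bigr] \frac{\|w\|_Y}{\sqrt{k+1}} \;=\; c_*\, \frac{\|w\|_Y}{\sqrt{k+1}} , \qquad
  \|A e_k\|_Y \;\le\; 2\Bigl[1 + \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}\Bigr] \frac{\|w\|_Y}{k+1} \;=\; c_*\, \frac{\|w\|_Y}{k+1} .
\]

This shows (4.49) for $k$. Therefore, (4.49) and (4.50) are proven for all
$0 \le j, k < k_*(\delta)$.

In the case $k_*(\delta) \ge 1$, we take $k = k_*(\delta) - 1$ in (4.50) and obtain

\[
  k_*(\delta) \;\le\; 2\, \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}\, \frac{\|w\|_Y}{\delta} . \tag{4.51}
\]

From the estimate $\|K(x_k^\delta) - y^\delta\|_Y \le 2\, \|K(x_k^\delta) - K(x^*)\|_Y \le \frac{2}{1-\eta}\, \|A e_k\|_Y \le 4\, \|A e_k\|_Y$
for $k < k_*(\delta)$ it follows that

\[
  \|K(x_k^\delta) - y^\delta\|_Y \;\le\; 4 c_*\, \|w\|_Y\, \frac{1}{k+1} , \qquad k = 0,\ldots,k_*(\delta) - 1. \tag{4.52}
\]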
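
The passage from (4.48) to (4.50) rests on an elementary identity: if $\delta \le q\,(\delta + 2\|w\|_Y/(k+1))$ with $q = \frac{1-2\eta}{(1+4\eta)(1-\eta)} < 1$, then $\delta \le \frac{2q}{1-q}\,\|w\|_Y/(k+1)$, and $\frac{2q}{1-q} = \frac{2(1-2\eta)}{\eta(5-4\eta)}$. The following sketch, with hypothetical function names, confirms this identity in exact rational arithmetic on a sample of values $\eta \in (0, 1/2)$. It assumes the constants have the form written in this paragraph.

```python
from fractions import Fraction

def q_factor(eta):
    """Factor q in delta <= q * ||A e_k||, cf. (4.48)."""
    return (1 - 2 * eta) / ((1 + 4 * eta) * (1 - eta))

def bound_450(eta):
    """Factor 2 (1 - 2 eta) / (eta (5 - 4 eta)), cf. (4.50)."""
    return 2 * (1 - 2 * eta) / (eta * (5 - 4 * eta))

def c_star(eta):
    """Constant c_* = 2 (1 + (1 - 2 eta) / (eta (5 - 4 eta)))."""
    return 2 * (1 + (1 - 2 * eta) / (eta * (5 - 4 * eta)))

etas = [Fraction(n, 100) for n in range(1, 50)]          # eta = 0.01, ..., 0.49
identity_holds = all(
    2 * q_factor(e) / (1 - q_factor(e)) == bound_450(e) for e in etas
)
c_star_consistent = all(c_star(e) == 2 + bound_450(e) for e in etas)
```

Because `Fraction` arithmetic is exact, both flags are genuine equalities on the sampled values rather than floating-point approximations.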


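Now we come to the final part of the proof. We write $e_k$ from (4.46) in the
form

\[
\begin{aligned}
 e_k &= -(I - A^*A)^k A^* w + \Bigl[\sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr] (y^\delta - y^*) + \sum_{j=0}^{k-1} (I - A^*A)^j A^* z_{k-j-1} \\
 &= -A^* (I - AA^*)^k w + \Bigl[\sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr] (y^\delta - y^*) + A^* \sum_{j=0}^{k-1} (I - AA^*)^j z_{k-j-1} \\
 &= A^* w_k + \Bigl[\sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr] (y^\delta - y^*)
\end{aligned} \tag{4.53}
\]

for $k = 0,\ldots,k_*(\delta)$ with

\[
  w_k \;=\; -(I - AA^*)^k w + \sum_{j=0}^{k-1} (I - AA^*)^j z_{k-j-1} .
\]

Since $\|(I - AA^*)^j\|_{\mathcal{L}(Y,Y)} \le 1$ for all $j$, we conclude

\[
\begin{aligned}
 \|w_k\|_Y &\le \|w\|_Y + \sum_{j=0}^{k-1} \|z_{k-j-1}\|_Y \;\le\; \|w\|_Y + \frac{9}{2}\, C \sum_{j=0}^{k-1} \|e_{k-j-1}\|_X\, \|A e_{k-j-1}\|_Y \\
 &\le \|w\|_Y + \frac{9}{2}\, C\, c_*^2\, \|w\|_Y^2 \sum_{j=0}^{k-1} \frac{1}{(k-j)^{3/2}} \;\le\; \|w\|_Y + \frac{9}{2}\, C\, c_*^2\, \|w\|_Y^2 \sum_{j=1}^{\infty} \frac{1}{j^{3/2}} \\
 &\le \|w\|_Y \Bigl[1 + \frac{9}{2}\, C\, c_*^2 \sum_{j=1}^{\infty} \frac{1}{j^{3/2}}\Bigr]
\end{aligned}
\]

for all $k = 0,\ldots,k_*(\delta)$. Here we used without loss of generality that $\|w\|_Y \le 1$.
The bound on the right-hand side is independent of $\delta$. Similarly, we estimate
$\|AA^* w_k\|_Y$. Indeed, with

\[
  AA^* w_k \;=\; A e_k - \Bigl[A \sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr] (y^\delta - y^*)
\]

from (4.53) and

\[
  A \sum_{j=0}^{k-1} (I - A^*A)^j A^* \;=\; \sum_{j=0}^{k-1} (I - AA^*)^j AA^* \;=\; -\sum_{j=0}^{k-1} (I - AA^*)^{j+1} + \sum_{j=0}^{k-1} (I - AA^*)^j \;=\; I - (I - AA^*)^k
\]

we estimate

\[
\begin{aligned}
 \|AA^* w_k\|_Y &\le \|A e_k\|_Y + \Bigl\|A \sum_{j=0}^{k-1} (I - A^*A)^j A^*\Bigr\|_{\mathcal{L}(Y,Y)} \|y^\delta - y^*\|_Y \\
 &\le \|A e_k\|_Y + \|I - (I - AA^*)^k\|_{\mathcal{L}(Y,Y)}\, \delta \;\le\; \|A e_k\|_Y + \delta
\end{aligned}
\]

for $k = 0,\ldots,k_*(\delta)$. For $k = k_*(\delta)$ the estimate (4.31) and the stopping rule
imply that

\[
\begin{aligned}
 \|AA^* w_{k_*(\delta)}\|_Y &\le (1+\eta)\, \|K(x_{k_*(\delta)}^\delta) - y^*\|_Y + \delta \\
 &\le (1+\eta) \bigl[\|K(x_{k_*(\delta)}^\delta) - y^\delta\|_Y + \delta\bigr] + \delta \;\le\; \bigl[(1+\eta)(1+r) + 1\bigr]\, \delta .
\end{aligned}
\]

This implies

\[
  \|A^* w_{k_*(\delta)}\|_X^2 \;=\; \bigl(A^* w_{k_*(\delta)},\, A^* w_{k_*(\delta)}\bigr)_X \;=\; \bigl(AA^* w_{k_*(\delta)},\, w_{k_*(\delta)}\bigr)_Y
  \;\le\; \|AA^* w_{k_*(\delta)}\|_Y\, \|w_{k_*(\delta)}\|_Y \;\le\; c_1\, \delta\, \|w\|_Y
\]

with some constant $c_1$ which is independent of $\delta$ and $w$. Now we go back to
(4.53) and get, using part (c) of Corollary 4.43 and (4.51) (in the case $k_*(\delta) \ge 1$,
otherwise directly)

\[
\begin{aligned}
 \|e_{k_*(\delta)}\|_X &\le \sqrt{c_1\, \delta\, \|w\|_Y} + \Bigl\|\sum_{j=0}^{k_*(\delta)-1} (I - A^*A)^j A^*\Bigr\|_{\mathcal{L}(Y,X)} \delta \\
 &\le \sqrt{c_1\, \delta\, \|w\|_Y} + \sqrt{k_*(\delta)}\; \delta \\
 &\le \sqrt{c_1\, \delta\, \|w\|_Y} + \sqrt{2\, \frac{1 - 2\eta}{\eta\,(5 - 4\eta)}}\; \|w\|_Y^{1/2}\, \sqrt{\delta} \;\le\; c\, \|w\|_Y^{1/2}\, \sqrt{\delta} .
\end{aligned}
\]

This, finally, ends the proof.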


The Landweber iteration is only one member of the large class of iterative regu- 
larization methods. In particular, Newton-type methods, combined with various 
forms of regularization, have been investigated in the past and are subject of 
current research. We refer to the monograph [149] of Kaltenbacher, Neubauer, 
and Scherzer. 


4.4 Problems 


4.1 Let $K : X \supset D(K) \to Y$ be a mapping with $\|\tilde{x} - x\|_X \le c\,\|K(\tilde{x}) - K(x)\|_Y$
for all $x, \tilde{x} \in B(x^*,\rho)$ where $c$ is independent of $\tilde{x}$ and $x$. Set $y^* = K(x^*)$.
Show that the equation $K(x) = y^*$ is not locally ill-posed in the sense of
Definition 4.1.


4.2 Define the integral operator (auto-convolution) $K$ from $L^2(0,1)$ into itself
by

\[
  K(x)(t) \;=\; \int_0^t x(t - s)\, x(s)\, ds , \qquad t \in (0,1).
\]

(a) Show that $K$ is well-defined; that is, that $K(x) \in L^2(0,1)$ for every
$x \in L^2(0,1)$. Remark: $K$ is even well-defined as a mapping from $L^2(0,1)$
into $C[0,1]$. One may try to prove this.

(b) Show that $K(x) = y$ is locally ill-posed in the sense of Definition 4.1
in every $x^* \in L^2(0,1)$ with $x^*(t) \ge 0$ for almost all $t \in (0,1)$.

Hint: For any $r > 0$ and $n \in \mathbb{N}$ define $x_n \in L^2(0,1)$ by

\[
  x_n(t) \;=\; \begin{cases} x^*(t), & 0 < t < 1 - 1/n, \\ x^*(t) + r\sqrt{n}, & 1 - 1/n < t < 1. \end{cases}
\]


4.3 Show that for every $t \in [0,2]$ there exists $c_t > 0$ such that

\[
  \frac{\mu^t}{\mu^2 + \alpha} \;\le\; c_t\, \alpha^{t/2 - 1} \quad\text{for all } \mu, \alpha > 0.
\]


4.4 Show that $\|K(x^{\alpha(\delta),\delta}) - y^\delta + \alpha w\|_Y = O(\delta)$ as $\delta \to 0$ for the choice
$c_- \delta^{2/(\sigma+1)} \le \alpha(\delta) \le c_+ \delta^{2/(\sigma+1)}$ and $x^{\alpha,\delta}$ and $w$ as in Theorem 4.15.


4.5 (a) Show that $\|u\|_\infty \le \|u'\|_{L^2(0,1)}$ for all $u \in H^1(0,1)$ with $u(0) = 0$.

(b) Show that $H^2(0,1)$ is compactly imbedded in $C[0,1]$.

(c) Show that the solution $u \in H^2(0,1)$ of $u'' = -g$ in $(0,1)$ and $u(0) =
u(1) = 0$ is given by $u(t) = \int_0^1 G(t,s)\, g(s)\, ds$ where $G(t,s)$ is defined in
Lemma 4.16.


4.6 Let the affine functions $f_n : \mathbb{R}_{\ge 0} \to \mathbb{R}$ for $n \in \mathbb{N}$ be given by $f_n(t) =
\gamma_n t + \eta_n$, $t \ge 0$, where $\gamma_n, \eta_n \ge 0$. Show that the function $\varphi(t) := \inf_{n \in \mathbb{N}} f_n(t)$,
$t \ge 0$, is continuous, monotonic, and concave.


4.7 (a) Show that the set $U_\delta$ from Corollary 4.18 is open in $L^2(0,1)$.

(b) Let $K : U_\delta \to H^2(0,1)$ be the operator from Theorem 4.19; that is,
$c \mapsto u$ where $u \in H^2(0,1)$ solves the boundary value problem (4.11). Show
that this $K$ satisfies the Tangential Cone Condition of Assumption 4.32.

Hint: Modify the differential equation (4.15) such that $u - \tilde{u}$ appears on
the right hand side and continue as in the proof of Theorem 4.19.




Chapter 5 


Inverse Eigenvalue 
Problems 


5.1 Introduction 


Inverse eigenvalue problems are not only interesting in their own right, but 
also have important practical applications. We recall the fundamental paper by 
Kac [148]. Other applications appear in parameter identification problems for 
parabolic or hyperbolic differential equations, as we study in Section 5.6 for a
model problem (see also [167, 187, 255]), or in grating theory ([156]).

We study the Sturm-Liouville eigenvalue problem in canonical form. The
direct problem is to determine the eigenvalues $\lambda$ and the corresponding
eigenfunctions $u \ne 0$ such that

\[
  -\frac{d^2 u(x)}{dx^2} + q(x)\, u(x) \;=\; \lambda\, u(x), \qquad 0 \le x \le 1, \tag{5.1a}
\]
\[
  u(0) = 0 \quad\text{and}\quad h\, u'(1) + H\, u(1) = 0, \tag{5.1b}
\]

where $q \in L^2(0,1)$ and $h, H \in \mathbb{R}$ with $h^2 + H^2 > 0$ are given. In this chapter, we
assume that all functions are real-valued. In some applications, e.g., in grating
theory, complex-valued functions $q$ are also of practical importance. Essentially,
all of the results of this chapter hold also for complex-valued $q$ and are proven
mainly by the same arguments. We refer to the remarks at the end of each
section.

The eigenvalue problem (5.1a), (5.1b) is a special case of the more general
eigenvalue problem to determine $\rho \in \mathbb{R}$ and non-vanishing $w$ such that

\[
  \frac{d}{dt}\Bigl(p(t)\, \frac{dw(t)}{dt}\Bigr) + \bigl[\rho\, r(t) - q(t)\bigr]\, w(t) \;=\; 0, \qquad t \in [a,b], \tag{5.2a}
\]
\[
  \alpha_a\, w'(a) + \beta_a\, w(a) = 0, \qquad \alpha_b\, w'(b) + \beta_b\, w(b) = 0. \tag{5.2b}
\]
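© Springer Nature Switzerland AG 2021

Here $p$, $r$, and $q$ are given functions with $p(t) > 0$ and $r(t) > 0$ for $t \in [a,b]$,
and $\alpha_a, \alpha_b, \beta_a, \beta_b \in \mathbb{R}$ are constants with $\alpha_a^2 + \beta_a^2 > 0$ and $\alpha_b^2 + \beta_b^2 > 0$. If
we assume, however, that $q \in C[a,b]$ and $p, r \in C^2[a,b]$, then the Liouville
transformation reduces the eigenvalue problem (5.2a), (5.2b) to the canonical
form (5.1a), (5.1b). In particular, we define

\[
  g(t) := \sqrt{\frac{r(t)}{p(t)}}, \qquad f(t) := \bigl[r(t)\, p(t)\bigr]^{1/4}, \qquad L := \int_a^b g(s)\, ds , \tag{5.3}
\]

the monotonic function $\sigma : [a,b] \to [0,1]$ by

\[
  \sigma(t) := \frac{1}{L} \int_a^t g(s)\, ds , \qquad t \in [a,b], \tag{5.4}
\]

and the new function $u : [0,1] \to \mathbb{R}$ by $u(x) := f(t(x))\, w(t(x))$, $x \in [0,1]$, where
$t = t(x)$ denotes the inverse of $x = \sigma(t)$. Elementary calculations show that $u$
satisfies the differential equation (5.1a) with $\lambda = L^2 \rho$ and

\[
  q(x) \;=\; L^2 \Bigl[\frac{q(t)}{r(t)} + \frac{f(t)}{r(t)} \Bigl(\frac{p(t)\, f'(t)}{f(t)^2}\Bigr)'\Bigr]_{t = t(x)} . \tag{5.5}
\]

Also, it is easily checked that the boundary conditions (5.2b) are mapped into
the boundary conditions

\[
  h_0\, u'(0) + H_0\, u(0) = 0 \quad\text{and}\quad h_1\, u'(1) + H_1\, u(1) = 0 \tag{5.6}
\]

with $h_0 = \alpha_a\, g(a)/(L f(a))$ and $H_0 = \beta_a / f(a) - \alpha_a f'(a)/f(a)^2$ and, analogously,
$h_1$, $H_1$ with $a$ replaced by $b$.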

In this chapter, we restrict ourselves to the study of the canonical
Sturm-Liouville eigenvalue problem (5.1a), (5.1b). In the first part, we study the case
$h = 0$ in some detail. At the end of Section 5.3, we briefly discuss the case
where $h = 1$. In Section 5.3, we prove that there exists a countable number of
eigenvalues $\lambda_n$ of this problem and also prove an asymptotic formula. Because
$q$ is real-valued, the problem is self-adjoint, and the existence of a countable
number of eigenvalues follows from the general spectral theorem of functional
analysis (see Appendix A.6, Theorem A.53). Because this general theorem
provides only the information that the eigenvalues tend to infinity, we need
other tools to obtain more information about the rate of convergence. The basic
ingredient in the proof of the asymptotic formula is the asymptotic behavior of
the fundamental system of the differential equation (5.1a) as $|\lambda|$ tends to infinity.
Although all of the data and the eigenvalues are real-valued, we use results from
complex analysis, in particular, Rouché's theorem. This makes it necessary to
allow the parameter $\lambda$ in the fundamental system to be complex-valued. The
existence of a fundamental solution and its asymptotics is the subject of the
next section.

Section 5.5 is devoted to the corresponding inverse problem: Given the
eigenvalues $\lambda_n$, determine the function $q$. In Section 5.6, we demonstrate how inverse
spectral problems arise in a parameter identification problem for a parabolic
initial value problem. Section 5.7, finally, studies numerical procedures for
recovering $q$ that have been suggested by Rundell and others (see [186, 233, 234]).

We finish this section with a “negative” result, as seen in Example 5.1. 


Example 5.1
Let $\lambda$ be an eigenvalue and $u$ a corresponding eigenfunction of

\[
  -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad 0 < x < 1, \qquad u(0) = 0, \quad u(1) = 0.
\]

Then $\lambda$ is also an eigenvalue with corresponding eigenfunction $v(x) := u(1 - x)$
of the eigenvalue problem

\[
  -v''(x) + \tilde{q}(x)\, v(x) = \lambda\, v(x), \quad 0 < x < 1, \qquad v(0) = 0, \quad v(1) = 0,
\]

where $\tilde{q}(x) := q(1 - x)$.


This example shows that it is generally impossible to recover the function 
q unless more information is available. We will see that q can be recovered 
uniquely, provided we know that it is an even function with respect to 1/2 or 
if we know a second spectrum; that is, a spectrum for a boundary condition 
different from u(1) = 0. 
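
The non-uniqueness of Example 5.1 is also visible in a discretization. The sketch below is an illustration under assumptions: it uses a standard second-order finite-difference approximation of the Dirichlet problem on a uniform grid, and the potential `q` and the function name are arbitrary choices made here, not taken from the text.

```python
import numpy as np

def dirichlet_eigenvalues(q, n=400, count=5):
    """Smallest eigenvalues of -u'' + q u = lambda u, u(0) = u(1) = 0 (finite differences)."""
    h = 1.0 / n
    x = np.linspace(h, 1.0 - h, n - 1)            # interior grid points
    main = 2.0 / h**2 + q(x)
    off = -np.ones(n - 2) / h**2
    T = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(T)[:count]

q = lambda x: 10.0 * x**2 + np.sin(3.0 * x)       # arbitrary non-symmetric potential
q_reflected = lambda x: q(1.0 - x)                # the reflected potential of Example 5.1

ev = dirichlet_eigenvalues(q)
ev_reflected = dirichlet_eigenvalues(q_reflected)
```

Since the grid is symmetric with respect to $1/2$, the two discrete operators are similar via the reversal of the grid, so `ev` and `ev_reflected` agree up to rounding although the two potentials differ.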


5.2 Construction of a Fundamental System 


It is well-known from the theory of linear ordinary differential equations that
the following initial value problems are uniquely solvable for every fixed (real-
or complex-valued) $q \in C[0,1]$ and every given $\lambda \in \mathbb{C}$:

\[
  -u_1'' + q(x)\, u_1 = \lambda u_1, \quad 0 < x < 1, \qquad u_1(0) = 1, \quad u_1'(0) = 0, \tag{5.7a}
\]
\[
  -u_2'' + q(x)\, u_2 = \lambda u_2, \quad 0 < x < 1, \qquad u_2(0) = 0, \quad u_2'(0) = 1. \tag{5.7b}
\]

Uniqueness and existence for $q \in L^2(0,1)$ is shown in Theorem 5.4 below. The
set of functions $\{u_1, u_2\}$ is called a fundamental system of the differential
equation $-u'' + qu = \lambda u$ in $(0,1)$. The functions $u_1$ and $u_2$ are linearly independent
because the Wronskian determinant is one:

\[
  [u_1, u_2] \;:=\; \det \begin{pmatrix} u_1 & u_2 \\ u_1' & u_2' \end{pmatrix} \;=\; u_1 u_2' - u_1' u_2 \;=\; 1. \tag{5.8}
\]

This is seen from

\[
  \frac{d}{dx}\, [u_1, u_2] \;=\; u_1 u_2'' - u_1'' u_2 \;=\; u_1 (q - \lambda)\, u_2 - u_2 (q - \lambda)\, u_1 \;=\; 0
\]

and $[u_1, u_2](0) = 1$. The functions $u_1$ and $u_2$ depend on $\lambda$ and $q$. We express
this dependence often by $u_j = u_j(\cdot, \lambda, q)$, $j = 1, 2$. For $q \in L^2(0,1)$, the solution
is not twice continuously differentiable anymore but is only an element of the
Sobolev space

\[
  H^2(0,1) \;:=\; \Bigl\{ u \in C^1[0,1] : u'(x) = \alpha + \int_0^x v(t)\, dt, \ \alpha \in \mathbb{C}, \ v \in L^2(0,1) \Bigr\},
\]

see (1.24). We write $u''$ for $v$ and observe that $u'' \in L^2(0,1)$. The most
important example is when $q = 0$. In this case, we can solve (5.7a) and (5.7b)
explicitly and have the following:


Example 5.2
Let $q = 0$. Then the solutions of (5.7a) and (5.7b) are given by

\[
  u_1(x, \lambda, 0) = \cos(\sqrt{\lambda}\, x) \quad\text{and}\quad u_2(x, \lambda, 0) = \frac{\sin(\sqrt{\lambda}\, x)}{\sqrt{\lambda}}, \tag{5.9}
\]

respectively. An arbitrary branch of the square root can be taken because
$s \mapsto \cos(sx)$ and $s \mapsto \sin(sx)/s$ are even functions.

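We will see that the fundamental solution for any function $q \in L^2(0,1)$
behaves as (5.9) as $|\lambda|$ tends to infinity. For the proof of the next theorem, we
need the following technical lemma.


Lemma 5.3 Let $q \in L^2(0,1)$ and $k, \tilde{k} \in C[0,1]$ such that there exists $\mu \ge 0$
with $|k(\tau)| \le \exp(\mu\tau)$ and $|\tilde{k}(\tau)| \le \exp(\mu\tau)$ for all $\tau \in [0,1]$. Let $K, \tilde{K} :
C[0,1] \to C[0,1]$ be the Volterra integral operators with kernels $k(x - t)\, q(t)$ and
$\tilde{k}(x - t)\, q(t)$, respectively; that is,

\[
  (K\phi)(x) \;=\; \int_0^x k(x - t)\, q(t)\, \phi(t)\, dt, \qquad 0 \le x \le 1,
\]

and analogously for $\tilde{K}$. Then the following estimate holds:

\[
  \bigl|(\tilde{K} K^{n-1} \phi)(x)\bigr| \;\le\; \|\phi\|_\infty\, \frac{1}{n!}\, \hat{q}(x)^n\, e^{\mu x}, \qquad 0 \le x \le 1, \tag{5.10}
\]

for all $\phi \in C[0,1]$ and all $n \in \mathbb{N}$. Here, $\hat{q}(x) := \int_0^x |q(t)|\, dt$. If $\phi \in C[0,1]$
satisfies also the estimate $|\phi(\tau)| \le \exp(\mu\tau)$ for all $\tau \in [0,1]$, then we have

\[
  \bigl|(\tilde{K} K^{n-1} \phi)(x)\bigr| \;\le\; \frac{1}{n!}\, \hat{q}(x)^n\, e^{\mu x}, \qquad 0 \le x \le 1, \tag{5.11}
\]

for all $n \in \mathbb{N}$.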


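Proof: We prove the estimates by induction with respect to $n$.
For $n = 1$, we estimate

\[
  |(\tilde{K}\phi)(x)| \;=\; \Bigl|\int_0^x \tilde{k}(x - t)\, q(t)\, \phi(t)\, dt\Bigr|
  \;\le\; \|\phi\|_\infty \int_0^x e^{\mu(x - t)}\, |q(t)|\, dt \;\le\; \|\phi\|_\infty\, e^{\mu x}\, \hat{q}(x).
\]

Now we assume the validity of (5.10) for $n$. Because it holds also for $\tilde{K} = K$,
we estimate

\[
  \bigl|(\tilde{K} K^{n} \phi)(x)\bigr| \;\le\; \int_0^x e^{\mu(x - t)}\, |q(t)|\, \bigl|(K^n \phi)(t)\bigr|\, dt
  \;\le\; \|\phi\|_\infty\, \frac{1}{n!}\, e^{\mu x} \int_0^x |q(t)|\, \hat{q}(t)^n\, dt.
\]

We compute the last integral by

\[
  \int_0^x |q(t)|\, \hat{q}(t)^n\, dt \;=\; \int_0^x \hat{q}'(t)\, \hat{q}(t)^n\, dt \;=\; \frac{1}{n+1} \int_0^x \frac{d}{dt}\bigl(\hat{q}(t)^{n+1}\bigr)\, dt \;=\; \frac{1}{n+1}\, \hat{q}(x)^{n+1}.
\]

This proves the estimate (5.10) for $n + 1$.
For estimate (5.11), we only change the initial step $n = 1$ into

\[
  |(\tilde{K}\phi)(x)| \;\le\; \int_0^x e^{\mu(x - t)}\, e^{\mu t}\, |q(t)|\, dt \;\le\; e^{\mu x}\, \hat{q}(x).
\]

The remaining part is proven by the same arguments.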


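Now we prove the equivalence of the initial value problems for $u_j$, $j = 1, 2$,
to Volterra integral equations.


Theorem 5.4 Let $q \in L^2(0,1)$ and $\lambda \in \mathbb{C}$. Then we have

(a) $u_1, u_2 \in H^2(0,1)$ are solutions of (5.7a) and (5.7b), respectively, if and
only if $u_1, u_2 \in C[0,1]$ solve the Volterra integral equations:

\[
  u_1(x) \;=\; \cos(\sqrt{\lambda}\, x) + \int_0^x \frac{\sin \sqrt{\lambda}(x - t)}{\sqrt{\lambda}}\, q(t)\, u_1(t)\, dt, \tag{5.12a}
\]
\[
  u_2(x) \;=\; \frac{\sin(\sqrt{\lambda}\, x)}{\sqrt{\lambda}} + \int_0^x \frac{\sin \sqrt{\lambda}(x - t)}{\sqrt{\lambda}}\, q(t)\, u_2(t)\, dt, \tag{5.12b}
\]

respectively, for $0 \le x \le 1$.

(b) The integral equations (5.12a) and (5.12b) and the initial value problems
(5.7a) and (5.7b) are uniquely solvable. The solutions can be represented
by a Neumann series. Let $K$ denote the integral operator

\[
  (K\phi)(x) \;:=\; \int_0^x \frac{\sin \sqrt{\lambda}(x - t)}{\sqrt{\lambda}}\, q(t)\, \phi(t)\, dt, \qquad x \in [0,1], \tag{5.13}
\]

and define

\[
  C(x) := \cos(\sqrt{\lambda}\, x) \quad\text{and}\quad S(x) := \frac{\sin(\sqrt{\lambda}\, x)}{\sqrt{\lambda}}. \tag{5.14}
\]

Then

\[
  u_1 = \sum_{n=0}^{\infty} K^n C \quad\text{and}\quad u_2 = \sum_{n=0}^{\infty} K^n S. \tag{5.15}
\]

The series converge uniformly with respect to $(x, \lambda, q) \in [0,1] \times \Lambda \times Q$ for
all bounded sets $\Lambda \subset \mathbb{C}$ and $Q \subset L^2(0,1)$.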


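Proof: (a) We use the following version of partial integration for $f, g \in H^2(0,1)$:

\[
  \int_a^b f''(t)\, g(t)\, dt \;=\; \int_a^b f(t)\, g''(t)\, dt + \bigl[f'(t)\, g(t) - f(t)\, g'(t)\bigr]_a^b . \tag{5.16}
\]

We restrict ourselves to the proof for $u_1$. Let $u_1$ be a solution of (5.7a). Then

\[
\begin{aligned}
 \int_0^x S(x - t)\, q(t)\, u_1(t)\, dt &= \int_0^x S(x - t)\, \bigl[\lambda u_1(t) + u_1''(t)\bigr]\, dt \\
 &= \int_0^x u_1(t)\, \underbrace{\bigl[\lambda S(x - t) + S''(x - t)\bigr]}_{=\,0}\, dt + \bigl[u_1'(t)\, S(x - t) + u_1(t)\, S'(x - t)\bigr]_{t=0}^{t=x} \\
 &= u_1(x) - \cos(\sqrt{\lambda}\, x).
\end{aligned}
\]

On the other hand, let $u_1 \in C[0,1]$ be a solution of the integral equation (5.12a).
The operator $A : L^2(0,1) \to L^2(0,1)$ defined by $(A\phi)(x) = \int_0^x S(x - t)\, \phi(t)\, dt$,
$x \in (0,1)$, is bounded, and it is easily seen that $(A\phi)'' + \lambda (A\phi) = \phi$ for $\phi \in
C[0,1]$. Therefore, $A$ is even bounded from $L^2(0,1)$ into $H^2(0,1)$. Writing
(5.12a) in the form $u_1 = C + A(q u_1)$ yields that $u_1 \in H^2(0,1)$ and $u_1'' =
-\lambda C + q u_1 - \lambda A(q u_1) = q u_1 - \lambda u_1$. This proves the assertion because the initial
conditions are obviously satisfied.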


(b) We observe that all of the functions $k(\tau) = \cos(\sqrt{\lambda}\,\tau)$, $k(\tau) = \sin(\sqrt{\lambda}\,\tau)$, and $k(\tau) = \sin(\sqrt{\lambda}\,\tau)/\sqrt{\lambda}$ for $\tau \in [0,1]$ satisfy the estimate $|k(\tau)| \le \exp(\mu\tau)$ with $\mu = |\mathrm{Im}\sqrt{\lambda}|$. This is obvious for the first two functions. For the third, it follows from

\[ \left| \frac{\sin(\sqrt{\lambda}\,\tau)}{\sqrt{\lambda}} \right| \le \int_0^\tau |\cos(\sqrt{\lambda}\,s)|\, ds \le \int_0^\tau \cosh(\mu s)\, ds \le \cosh(\mu\tau) \le e^{\mu\tau}. \]

We have to study the integral operator $K$ with kernel $k(x-t)\,q(t)$, where $k(\tau) = \sin(\sqrt{\lambda}\,\tau)/\sqrt{\lambda}$. We apply Lemma 5.3 with $\tilde K = K$. Estimate (5.10) yields

\[ \|K^n\|_\infty \le \frac{\hat q(1)^n}{n!}\, e^{\mu} < 1 \]

for sufficiently large $n$ uniformly for $q \in Q$ and $\lambda \in \Lambda$. Therefore, the Neumann series converges (see Appendix A.3, Theorem A.31), and part (b) is proven.
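The Neumann series (5.15) can also be summed numerically. The following is a minimal sketch, not part of the original text: it assumes real $\lambda > 0$ and a real-valued potential, discretizes the Volterra operator $K$ of (5.13) with the trapezoidal rule, and adds up the terms $K^n S$. The function name `u2_neumann` and the parameters `n_grid`, `n_terms` are illustrative choices.

```python
import math

def u2_neumann(lam, q, n_grid=200, n_terms=30):
    """Sum the Neumann series u2 = sum_n K^n S on a uniform grid of [0, 1].

    Assumes real lam > 0 and a real-valued callable q; the integral
    operator K is discretized with the trapezoidal rule.
    """
    h = 1.0 / n_grid
    root = math.sqrt(lam)
    # s[k] = S(k*h) with S(tau) = sin(sqrt(lam)*tau)/sqrt(lam)
    s = [math.sin(root * k * h) / root for k in range(n_grid + 1)]
    qs = [q(k * h) for k in range(n_grid + 1)]
    term = s[:]    # current term K^n S, starting with n = 0
    total = s[:]   # partial sum of the series
    for _ in range(n_terms):
        new = [0.0] * (n_grid + 1)
        for i in range(1, n_grid + 1):
            # trapezoidal rule for int_0^{x_i} S(x_i - t) q(t) term(t) dt
            acc = 0.5 * (s[i] * qs[0] * term[0] + s[0] * qs[i] * term[i])
            for j in range(1, i):
                acc += s[i - j] * qs[j] * term[j]
            new[i] = h * acc
        term = new
        total = [a + b for a, b in zip(total, term)]
    return total
```

For a constant potential $q = c$ with $\lambda > c$, the exact solution is $u_2(x) = \sin(\sqrt{\lambda - c}\,x)/\sqrt{\lambda - c}$, which gives a convenient check of the discretization.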


The integral representation of the previous theorem yields the following asymptotic behavior of the fundamental system by comparing the case for arbitrary $q$ with the case of $q = 0$.


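Theorem 5.5 Let $q \in L^2(0,1)$, $\lambda \in \mathbb{C}$, and $u_1$, $u_2$ be the fundamental system; that is, the solutions of the initial value problems (5.7a) and (5.7b), respectively. Then we have for all $x \in [0,1]$:

\[ |u_1(x) - \cos(\sqrt{\lambda}\,x)| \le \frac{1}{|\sqrt{\lambda}|} \exp\Bigl( |\mathrm{Im}\sqrt{\lambda}|\, x + \int_0^x |q(t)|\, dt \Bigr), \tag{5.17a} \]

\[ \Bigl| u_2(x) - \frac{\sin(\sqrt{\lambda}\,x)}{\sqrt{\lambda}} \Bigr| \le \frac{1}{|\lambda|} \exp\Bigl( |\mathrm{Im}\sqrt{\lambda}|\, x + \int_0^x |q(t)|\, dt \Bigr), \tag{5.17b} \]

\[ |u_1'(x) + \sqrt{\lambda}\,\sin(\sqrt{\lambda}\,x)| \le \exp\Bigl( |\mathrm{Im}\sqrt{\lambda}|\, x + \int_0^x |q(t)|\, dt \Bigr), \tag{5.17c} \]

\[ |u_2'(x) - \cos(\sqrt{\lambda}\,x)| \le \frac{1}{|\sqrt{\lambda}|} \exp\Bigl( |\mathrm{Im}\sqrt{\lambda}|\, x + \int_0^x |q(t)|\, dt \Bigr). \tag{5.17d} \]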


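Proof: Again, we use the Neumann series and define $C(\tau) := \cos(\sqrt{\lambda}\,\tau)$ and $S(\tau) := \sin(\sqrt{\lambda}\,\tau)/\sqrt{\lambda}$. Let $K$ be the integral operator with kernel $q(t)\,\sin(\sqrt{\lambda}(x-t))/\sqrt{\lambda}$. Then

\[ |u_1(x) - \cos(\sqrt{\lambda}\,x)| \le \sum_{n=1}^{\infty} |(K^n C)(x)|. \]

Now we set $\tilde k(\tau) = \sin(\sqrt{\lambda}\,\tau)$ and $k(\tau) = \sin(\sqrt{\lambda}\,\tau)/\sqrt{\lambda}$ and denote by $\tilde K$ and $K$ the Volterra integral operators with kernels $\tilde k(x-t)\,q(t)$ and $k(x-t)\,q(t)$, respectively. Then $K^n = \frac{1}{\sqrt{\lambda}}\, \tilde K K^{n-1}$ and, by Lemma 5.3, part (b), we conclude that

\[ |(K^n C)(x)| \le \frac{1}{|\sqrt{\lambda}|\, n!} \Bigl( \int_0^x |q(t)|\, dt \Bigr)^{n} \exp\bigl( |\mathrm{Im}\sqrt{\lambda}|\, x \bigr) \]

for $n \ge 1$. Summation now yields the desired estimate:

\[ |u_1(x) - \cos(\sqrt{\lambda}\,x)| \le \frac{1}{|\sqrt{\lambda}|} \exp\Bigl( |\mathrm{Im}\sqrt{\lambda}|\, x + \int_0^x |q(t)|\, dt \Bigr). \]

Because $|S(x)| \le \frac{1}{|\sqrt{\lambda}|} \exp(|\mathrm{Im}\sqrt{\lambda}|\, x)$, the same arguments prove the estimate (5.17b). Differentiation of the integral equations (5.12a) and (5.12b) yields

\[ u_1'(x) + \sqrt{\lambda}\,\sin(\sqrt{\lambda}\,x) = \int_0^x \cos\sqrt{\lambda}(x-t)\; q(t)\, u_1(t)\, dt, \]

\[ u_2'(x) - \cos(\sqrt{\lambda}\,x) = \int_0^x \cos\sqrt{\lambda}(x-t)\; q(t)\, u_2(t)\, dt. \]

Let $K$ be as before and let $\tilde K$ be defined as the operator with kernel $q(t)\,\cos\sqrt{\lambda}(x-t)$. Then

\[ u_1'(x) + \sqrt{\lambda}\,\sin(\sqrt{\lambda}\,x) = \tilde K \sum_{n=0}^{\infty} K^n C, \qquad u_2'(x) - \cos(\sqrt{\lambda}\,x) = \tilde K \sum_{n=0}^{\infty} K^n S, \]

and we use Lemma 5.3, estimate (5.11), again. Summation yields the estimates (5.17c) and (5.17d).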
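Estimate (5.17a) is easy to probe in a case where everything is explicit. The sketch below is an illustration only: it assumes a constant (possibly complex) potential $q = c$, for which the solution of (5.7a) is $u_1(x) = \cos(\sqrt{\lambda - c}\,x)$, and compares both sides of (5.17a) on a grid. The helper name `check_bound_517a` is hypothetical.

```python
import cmath
import math

def check_bound_517a(lam, c, n_points=101):
    """Check (5.17a) on a grid of [0, 1] for the constant potential q = c.

    For constant q the solution of (5.7a) is u1(x) = cos(sqrt(lam - c) x),
    so both sides of the estimate are available in closed form.
    """
    root = cmath.sqrt(lam)
    shifted = cmath.sqrt(lam - c)
    mu = abs(root.imag)
    for k in range(n_points):
        x = k / (n_points - 1)
        lhs = abs(cmath.cos(shifted * x) - cmath.cos(root * x))
        # int_0^x |q(t)| dt = |c| x for a constant potential
        rhs = math.exp(mu * x + abs(c) * x) / abs(root)
        if lhs > rhs:
            return False
    return True
```

Because the cosine is even, the branch of the square root chosen by `cmath.sqrt` does not affect either side of the inequality.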


In the next section, we need the fact that the eigenfunctions are continuously differentiable with respect to $q$ and $\lambda$. We remind the reader of the concept of Fréchet differentiability (F-differentiability) of an operator between Banach spaces $X$ and $Y$ (see Appendix A.7, Definition A.60). Here we consider the mapping $(\lambda, q) \mapsto u_j(\cdot, \lambda, q)$ from $\mathbb{C} \times L^2(0,1)$ into $C[0,1]$ for $j = 1, 2$. We denote these mappings by $u_j$ again and prove the following theorem:


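Theorem 5.6 Let $u_j : \mathbb{C} \times L^2(0,1) \to C[0,1]$, $j = 1, 2$, be the solution operator of (5.7a) and (5.7b), respectively. Then we have the following:

(a) $u_j$ is continuous.

(b) $u_j$ is continuously F-differentiable for every $(\hat\lambda, \hat q) \in \mathbb{C} \times L^2(0,1)$ with partial derivatives

\[ \frac{\partial}{\partial\lambda}\, u_j(\cdot, \hat\lambda, \hat q) = u_{j,\lambda}(\cdot, \hat\lambda, \hat q) \tag{5.18a} \]

and

\[ \frac{\partial}{\partial q}\, u_j(\cdot, \hat\lambda, \hat q)(q) = u_{j,q}(\cdot, \hat\lambda, \hat q), \tag{5.18b} \]

where $u_{j,\lambda}(\cdot, \hat\lambda, \hat q)$ and $u_{j,q}(\cdot, \hat\lambda, \hat q)$ are solutions of the following initial boundary value problems for $j = 1, 2$:

\[ \begin{aligned} -u_{j,\lambda}'' + (\hat q - \hat\lambda)\, u_{j,\lambda} &= u_j(\cdot, \hat\lambda, \hat q) \ \text{ in } (0,1), & u_{j,\lambda}(0) &= 0, \quad u_{j,\lambda}'(0) = 0, \\ -u_{j,q}'' + (\hat q - \hat\lambda)\, u_{j,q} &= -q\, u_j(\cdot, \hat\lambda, \hat q) \ \text{ in } (0,1), & u_{j,q}(0) &= 0, \quad u_{j,q}'(0) = 0. \end{aligned} \tag{5.19} \]

(c) Furthermore, for all $x \in [0,1]$ we have

\[ \int_0^x u_j(t)^2\, dt = [u_{j,\lambda}, u_j](x), \quad j = 1, 2, \tag{5.20a} \]

\[ \int_0^x u_1(t)\, u_2(t)\, dt = [u_{1,\lambda}, u_2](x) = [u_{2,\lambda}, u_1](x), \tag{5.20b} \]

\[ -\int_0^x q(t)\, u_j(t)^2\, dt = [u_{j,q}, u_j](x), \quad j = 1, 2, \tag{5.20c} \]

\[ -\int_0^x q(t)\, u_1(t)\, u_2(t)\, dt = [u_{1,q}, u_2](x) = [u_{2,q}, u_1](x), \tag{5.20d} \]

where $[u, v]$ denotes the Wronskian determinant from (5.8).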


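Proof: (a), (b): Continuity and differentiability of $u_j$ follow from the integral equations (5.12a) and (5.12b) because the kernel and the right-hand sides depend continuously and differentiably on $\lambda$ and $q$. It remains to show the representation of the derivatives in (b). Let $u = u_j$, $j = 1$ or $2$. Then, subtracting the differential equations for $u(\cdot, \lambda + \varepsilon)$ and $u(\cdot, \lambda)$ and dividing by $\varepsilon$, the difference quotient satisfies

\[ -\frac{1}{\varepsilon}\bigl[ u(\cdot, \lambda+\varepsilon) - u(\cdot, \lambda) \bigr]'' + (q - \lambda)\, \frac{1}{\varepsilon}\bigl[ u(\cdot, \lambda+\varepsilon) - u(\cdot, \lambda) \bigr] = u(\cdot, \lambda+\varepsilon) \quad\text{in } (0,1). \]

Furthermore, the homogeneous initial conditions are satisfied for the difference quotient. The right-hand side converges uniformly to $u(\cdot, \lambda)$ as $\varepsilon \to 0$. Therefore, the difference quotient converges to $u_\lambda$ uniformly in $x$. The same arguments yield the result for the derivative with respect to $q$.

(c) Multiplication of the differential equation for $u_{j,\lambda}$ by $u_j$ and the differential equation for $u_j$ by $u_{j,\lambda}$ and subtraction yields

\[ u_j^2(x) = u_j''(x)\, u_{j,\lambda}(x) - u_{j,\lambda}''(x)\, u_j(x) = \frac{d}{dx} \bigl( u_j'(x)\, u_{j,\lambda}(x) - u_{j,\lambda}'(x)\, u_j(x) \bigr). \]

Integration of this equation and the homogeneous boundary conditions yield the first equation (5.20a). The proofs for the remaining equations use the same arguments and are left to the reader.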
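As a sanity check of (5.20a), one can work out the case $q = 0$, $j = 2$ by hand. The computation below is an illustration added here, not taken from the original; it writes $s = \sqrt{\lambda} \ne 0$ and assumes the sign convention $[u, v] = u v' - u' v$ for the Wronskian, which is the one consistent with the identity derived in the proof of part (c).

```latex
\begin{align*}
u_2(x) &= \frac{\sin(sx)}{s}, \qquad u_2'(x) = \cos(sx), \\
u_{2,\lambda}(x) &= \frac{1}{2s}\,\frac{\partial}{\partial s}\,\frac{\sin(sx)}{s}
  = \frac{x\cos(sx)}{2s^2} - \frac{\sin(sx)}{2s^3}, \qquad
u_{2,\lambda}'(x) = -\frac{x\sin(sx)}{2s}, \\
[u_{2,\lambda},u_2](x) &= u_{2,\lambda}(x)\,u_2'(x) - u_{2,\lambda}'(x)\,u_2(x)
  = \frac{x}{2s^2} - \frac{\sin(2sx)}{4s^3}
  = \int_0^x \frac{\sin^2(st)}{s^2}\,dt .
\end{align*}
```

Note that $u_{2,\lambda}(0) = u_{2,\lambda}'(0) = 0$, in agreement with the homogeneous initial conditions in (5.19).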


At no place in this section have we used the assumption that q is real-valued. 
Therefore, the assertions of Theorems 5.4, 5.5, and 5.6 also hold for complex- 
valued q. 


5.3 Asymptotics of the Eigenvalues and Eigenfunctions


We first restrict ourselves to the Dirichlet problem; that is, the eigenvalue problem

\[ -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad 0 < x < 1, \qquad u(0) = u(1) = 0. \tag{5.21} \]

We refer to the end of this section for different boundary conditions. Again, let $q \in L^2(0,1)$ be real-valued. We observe that $\lambda \in \mathbb{C}$ is an eigenvalue of this problem if and only if $\lambda$ is a zero of the function

\[ f(\lambda) := u_2(1, \lambda, q). \]

Again, $u_2 = u_2(\cdot, \lambda, q)$ denotes the solution of the differential equation $-u_2'' + q\,u_2 = \lambda u_2$ in $(0,1)$ with initial conditions $u_2(0) = 0$ and $u_2'(0) = 1$. If $u_2(1, \lambda, q) = 0$, then $u = u_2(\cdot, \lambda, q)$ is an eigenfunction corresponding to the eigenvalue $\lambda$, normalized such that $u'(0) = 1$. There are different ways to normalize the eigenfunctions. Later we will sometimes normalize them such that the $L^2$-norms are one; that is, use $g = u/\|u\|_{L^2}$ instead of $u$.

The function $f$ plays exactly the role of the well-known characteristic polynomial for matrices and is, therefore, called the characteristic function of the eigenvalue problem. Theorem 5.6 implies that $f$ is differentiable; that is, analytic in all of $\mathbb{C}$. This observation makes it possible to use tools from complex analysis. First, we summarize well-known facts about eigenvalues and eigenfunctions for the Sturm-Liouville problem.


Theorem 5.7 Let $q \in L^2(0,1)$ be real-valued. Then

(a) All eigenvalues $\lambda$ are real.

(b) There exists a countable number of real eigenvalues $\lambda_j$, $j \in \mathbb{N}$, which tend to infinity as $j \to \infty$. The corresponding eigenfunctions $g_j \in C[0,1]$, normalized by $\|g_j\|_{L^2} = 1$, form a complete orthonormal system in $L^2(0,1)$.

(c) The geometric and algebraic multiplicities of the eigenvalues $\lambda_j$ are one; that is, the eigenspaces are one-dimensional and the zeros of the characteristic function $f$ are simple.

(d) Let the eigenvalues be ordered as $\lambda_1 < \lambda_2 < \lambda_3 < \cdots$. The eigenfunction $g_j$ corresponding to $\lambda_j$ has exactly $j - 1$ zeros in $(0,1)$.

(e) Let $q$ be even with respect to $1/2$; that is, $q(1-x) = q(x)$ for all $x \in [0,1]$. Then $g_j$ is even with respect to $1/2$ for odd $j$ and odd with respect to $1/2$ for even $j$.


Proof: (a) and (b) follow from the fact that the boundary value problem is self-adjoint. We refer to Problem 5.1 for a repetition of the proof (see also Theorems A.52 and A.53).

(c) Let $\lambda$ be an eigenvalue and $u$, $v$ be two corresponding eigenfunctions. Choose $\alpha$, $\beta$ with $\alpha^2 + \beta^2 > 0$, such that $\alpha u'(0) = \beta v'(0)$. The function $w := \alpha u - \beta v$ solves the differential equation and $w(0) = w'(0) = 0$; that is, $w$ vanishes identically. Therefore, $u$ and $v$ are linearly dependent.

We apply Theorem 5.6, part (c), to show that $\lambda$ is a simple zero of $f$. Because $u_2(1, \lambda, q) = 0$, we have from (5.20a) for $j = 2$ that

\[ f'(\lambda) = \frac{\partial}{\partial\lambda}\, u_2(1, \lambda, q) = u_{2,\lambda}(1, \lambda, q) = \frac{1}{u_2'(1, \lambda, q)} \int_0^1 u_2(x, \lambda, q)^2\, dx \ne 0. \tag{5.22} \]

This proves part (c).

(d) First we note that every $g_j$ has only a finite number of zeros. Otherwise, they would accumulate at some point $x \in [0,1]$, and it is not difficult to show that $g_j$ and also $g_j'$ vanish at $x$. This would imply that $g_j$ vanishes identically. We fix $j \in \mathbb{N}$ and define the function $h : [0,1] \times [0,1] \to \mathbb{R}$ by $h(t, x) = u_j(x; tq)$. Here, $u_j(\cdot; tq)$ is the $j$-th eigenfunction $u_j$ corresponding to $tq$ instead of $q$ and normalized such that $u_j'(0) = 1$. Then $h$ is continuously differentiable and $h(t, 0) = h(t, 1) = 0$ and every zero of $h(t, \cdot)$ is simple. This holds for every $t$. By part (a) of Lemma 5.8 below, the number of zeros of $h(t, \cdot)$ is constant with respect to $t$. Therefore, $u_j(\cdot, q) = h(1, \cdot)$ has exactly the same number of zeros as $h(0, x) = u_j(x, 0) = \sqrt{2}\,\sin(j\pi x)$, which is $j - 1$. The normalization to $g_j = u_j(\cdot, q)/\|u_j(\cdot, q)\|_{L^2}$ does not change the number of zeros.

(e) Again, it is sufficient to prove that $v_j = u_2(\cdot, \lambda_j, q)$ is even (or odd) for odd (or even) $j$. First, we note that also $\tilde v_j(x) := v_j(1 - x)$ is an eigenfunction. Since the eigenspace is one-dimensional, we conclude that there exists $\rho_j \in \mathbb{R}$ with $\tilde v_j(x) = v_j(1 - x) = \rho_j\, v_j(x)$ for all $x$. For $x = 1/2$ this implies that $(1 - \rho_j)\, v_j(1/2) = 0$ and also by differentiation $(1 + \rho_j)\, v_j'(1/2) = 0$ for all $j$. Since $v_j(1/2)$ and $v_j'(1/2)$ cannot vanish simultaneously, we conclude that $\rho_j \in \{+1, -1\}$ and even $\rho_j = \rho_j\, v_j'(0) = -v_j'(1)$. From (5.22), we conclude that $\operatorname{sign} f'(\lambda_j) = \operatorname{sign} v_j'(1) = -\operatorname{sign} \rho_j$. Since $\lambda_j$ are the subsequent zeros of $f$, we conclude that $f'(\lambda_j)\, f'(\lambda_{j+1}) < 0$ for all $j$; that is, $\rho_j = \sigma\,(-1)^{j+1}$, $j \in \mathbb{N}$, for some $\sigma \in \{+1, -1\}$. The first eigenfunction $v_1$ has no zero by part (d), which yields $\sigma = 1$. This ends the proof.


The first part of the following technical result has been used in the previous 
proof, the second part will be needed below. 


Lemma 5.8 (a) Let $h : [0,1] \times [0,1] \to \mathbb{R}$ be continuously differentiable such that $h(t, \cdot)$ has finitely many zeros in $[0,1]$ and all are simple for every $t \in [0,1]$. Then the number $m(t)$ of zeros of $h(t, \cdot)$ is constant with respect to $t$.

(b) Let $z \in \mathbb{C}$ with $|z - n\pi| \ge \pi/4$ for all $n \in \mathbb{Z}$. Then

\[ \exp(|\mathrm{Im}\, z|) \le 4\, |\sin z|. \]
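Part (b) of the lemma can be sanity-checked numerically. The following sketch is an illustration only (the function name `worst_ratio` and the sampling parameters are assumptions): it samples $z = x + iy$ with $x \in [-\pi/2, \pi/2]$, which suffices because $|\sin(z + \pi)| = |\sin z|$, discards points closer than $\pi/4$ to the zero of $\sin$ at the origin, and records the largest value of $\exp(|\mathrm{Im}\,z|)/|\sin z|$.

```python
import cmath
import math

def worst_ratio(n_real=200, n_imag=200, max_imag=6.0):
    """Return the largest sampled value of exp(|Im z|) / |sin z|.

    Only points with |z - n*pi| >= pi/4 for all integers n are kept; for
    Re z in [-pi/2, pi/2] this reduces to the single condition |z| >= pi/4.
    """
    worst = 0.0
    for i in range(n_real + 1):
        x = -math.pi / 2 + math.pi * i / n_real
        for j in range(n_imag + 1):
            y = -max_imag + 2 * max_imag * j / n_imag
            z = complex(x, y)
            if abs(z) < math.pi / 4:
                continue  # too close to the zero of sin at 0
            worst = max(worst, math.exp(abs(y)) / abs(cmath.sin(z)))
    return worst
```

By the lemma the returned value must stay below 4; a sampled maximum is of course only a plausibility check and not a proof.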


Proof: (a) It suffices to show that $t \mapsto m(t)$ is continuous. Fix $\hat t \in [0,1]$ and let $\hat x_j$, $j = 1, \ldots, m(\hat t)$, be the zeros of $h(\hat t, \cdot)$. Because $\partial h(\hat t, \hat x_j)/\partial x \ne 0$ there exist intervals $T = (\hat t - \delta, \hat t + \delta) \cap [0,1]$ and $J_j = (\hat x_j - \delta, \hat x_j + \delta) \cap [0,1]$ with $\partial h(t, x)/\partial x \ne 0$ for all $t \in T$ and $x \in \bigcup_j J_j$. Therefore, for every $t \in T$, the function $h(t, \cdot)$ has at most one zero in every $J_j$. On the other hand, by the implicit function theorem (applicable because $\partial h(\hat t, \hat x_j)/\partial x \ne 0$), for every $t \in T$ there exists at least one zero of $h(t, \cdot)$ in every $J_j$, where $T$ and $J_j$ are possibly made smaller. Outside of $\bigcup_j J_j$ there are no zeros of $h(\hat t, \cdot)$, and thus by making perhaps $T$ smaller again, no zeros of $h(t, \cdot)$ either for all $t \in T$. This shows that $m(t) = m(\hat t)$ for all $t \in T$, which shows that $t \mapsto m(t)$ is continuous, thus constant.

(b) Let $\psi(z) = \exp|z_2| / |\sin z|$ for $z = z_1 + i z_2$, $z_1, z_2 \in \mathbb{R}$ with $z \notin \{ n\pi : n \in \mathbb{Z} \}$. We consider two cases:

1st case: $|z_2| > \ln 2 / 2$. Then

\[ \psi(z) = \frac{2\, e^{|z_2|}}{|e^{i z_1 - z_2} - e^{-i z_1 + z_2}|} \le \frac{2\, e^{|z_2|}}{e^{|z_2|} - e^{-|z_2|}} = \frac{2}{1 - e^{-2|z_2|}} < 4 \]

because $\exp(-2|z_2|) < 1/2$.

2nd case: $|z_2| \le \ln 2 / 2$. From $|z - n\pi| \ge \pi/4$ for all $n$, we conclude that $|z_1 - n\pi|^2 \ge \pi^2/16 - z_2^2 \ge \pi^2/16 - (\ln 2)^2/4 \ge \pi^2/64$; thus $|\sin z_1| \ge \sin\frac{\pi}{8}$. With $|\mathrm{Re} \sin z| = |\sin z_1|\, |\cosh z_2| \ge |\sin z_1|$, we conclude that

\[ \psi(z) \le \frac{e^{|z_2|}}{|\mathrm{Re} \sin z|} \le \frac{\sqrt{2}}{|\sin z_1|} \le \frac{\sqrt{2}}{|\sin\frac{\pi}{8}|} < 4. \]

Now we prove the "counting lemma," a first crude asymptotic formula for the eigenvalues. As the essential tool in the proof, we use the theorem of Rouché from complex analysis (see [2]), which we state for the convenience of the reader:

Let $U \subset \mathbb{C}$ be a domain and the functions $F$ and $G$ be analytic in $\mathbb{C}$ and $|F(z) - G(z)| < |G(z)|$ for all $z \in \partial U$. Then $F$ and $G$ have the same number of zeros in $U$.
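Lemma 5.9 Let $q \in L^2(0,1)$ and $N > 2 \exp(\|q\|_{L^1})$ be an integer. Then

(a) The characteristic function $f(\lambda) := u_2(1, \lambda, q)$ has exactly $N$ zeros in the half-plane

\[ H := \bigl\{ \lambda \in \mathbb{C} : \mathrm{Re}\,\lambda < (N + 1/2)^2 \pi^2 \bigr\}. \tag{5.23} \]

(b) For every $m > N$ there exists exactly one zero of $f$ in the set

\[ U_m := \bigl\{ \lambda \in \mathbb{C} : |\sqrt{\lambda} - m\pi| < \pi/2 \bigr\}. \tag{5.24} \]

Here we take the branch with $\mathrm{Re}\sqrt{\lambda} \ge 0$.

(c) There are no other zeros of $f$ in $\mathbb{C}$.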


Proof: We are going to apply Rouché's theorem to the function $F(z) = f(z^2) = u_2(1, z^2, q)$ and the corresponding function $G$ of the eigenvalue problem for $q = 0$; that is, $G(z) := \sin z / z$. For $U$, we take one of the sets $W_m$ or $V_R$ defined by

\[ W_m := \{ z \in \mathbb{C} : |z - m\pi| < \pi/2 \}, \qquad V_R := \{ z \in \mathbb{C} : |\mathrm{Re}\, z| < (N + 1/2)\pi, \ |\mathrm{Im}\, z| < R \} \]

for fixed $R > (N + 1/2)\pi$ and want to apply Lemma 5.8:

(i) First let $z \in \partial W_m$. For $n \in \mathbb{Z}$, $n \ne m$, we have $|z - n\pi| \ge |m - n|\pi - |z - m\pi| \ge \pi - \pi/2 > \pi/4$. For $n = m$, we observe that $|z - m\pi| = \pi/2 > \pi/4$. Therefore, we can apply Lemma 5.8 for $z \in \partial W_m$. Furthermore, we note the estimate $|z| \ge m\pi - |z - m\pi| = (m - 1/2)\pi > N\pi > 2N$ for all $z \in \partial W_m$.

(ii) Let $z \in \partial V_R$, $n \in \mathbb{Z}$. Then $|\mathrm{Re}\, z| = (N + 1/2)\pi$ or $|\mathrm{Im}\, z| = R$. In either case, we estimate $|z - n\pi|^2 = (\mathrm{Re}\, z - n\pi)^2 + (\mathrm{Im}\, z)^2 \ge \pi^2/4 > \pi^2/16$. Therefore, we can apply Lemma 5.8 for $z \in \partial V_R$. Furthermore, we have the estimate $|z| \ge (N + 1/2)\pi > 2N$ for all $z \in \partial V_R$.

Application of Theorem 5.5 and Lemma 5.8 yields the following estimate for all $z \in \partial V_R \cup \partial W_m$:

\[ \Bigl| F(z) - \frac{\sin z}{z} \Bigr| \le \frac{1}{|z|^2} \exp\bigl( |\mathrm{Im}\, z| + \|q\|_{L^1} \bigr) \le \frac{4\, |\sin z|}{|z|^2}\, \frac{N}{2} = \frac{2N}{|z|}\, \Bigl| \frac{\sin z}{z} \Bigr| < \Bigl| \frac{\sin z}{z} \Bigr|. \]

Therefore, $F$ and $G(z) := \sin z / z$ have the same number of zeros in $V_R$ and every $W_m$. Because the zeros of $G$ are $\pm n\pi$, $n = 1, 2, \ldots$, we conclude that $G$ has exactly $2N$ zeros in $V_R$ and exactly one zero in every $W_m$. By the theorem of Rouché, this also holds for $F$.

Now we show that $F$ has no zero outside of $V_R \cup \bigcup_{m > N} W_m$. Again, we apply Lemma 5.8: Let $z \notin V_R \cup \bigcup_{m > N} W_m$. From $z \notin V_R$, we conclude that $|z| = \sqrt{(\mathrm{Re}\, z)^2 + (\mathrm{Im}\, z)^2} \ge (N + 1/2)\pi$. For $n > N$, we have that $|z - n\pi| \ge \pi/2$ because $z \notin W_n$. For $n \le N$, we conclude that $|z - n\pi| \ge |z| - n\pi \ge (N + 1/2 - n)\pi \ge \pi/2$. We apply Theorem 5.5 and Lemma 5.8 again and use the second triangle inequality. This yields

\[ |F(z)| \ge \Bigl| \frac{\sin z}{z} \Bigr| - \frac{1}{|z|^2} \exp\bigl( |\mathrm{Im}\, z| + \|q\|_{L^1} \bigr) \ge \Bigl| \frac{\sin z}{z} \Bigr| \Bigl( 1 - \frac{4 \exp(\|q\|_{L^1})}{|z|} \Bigr) \ge \Bigl| \frac{\sin z}{z} \Bigr| \Bigl( 1 - \frac{2N}{|z|} \Bigr) > 0 \]

because $|z| \ge (N + 1/2)\pi > 2N$. Therefore, we have shown that $f$ has exactly one zero in every $U_m$, $m > N$, and $N$ zeros in the set

\[ H_R := \bigl\{ \lambda \in \mathbb{C} : 0 < \mathrm{Re}\sqrt{\lambda} < (N + 1/2)\pi, \ |\mathrm{Im}\sqrt{\lambda}| < R \bigr\} \]

and no other zeros. It remains to show that $H_R \subset H$. For $\lambda = |\lambda| \exp(i\theta) \in H_R$, we conclude that $\mathrm{Re}\sqrt{\lambda} = \sqrt{|\lambda|}\, \cos\frac{\theta}{2} < (N + 1/2)\pi$; thus $\mathrm{Re}\,\lambda = |\lambda| \cos\theta \le |\lambda| \cos^2\frac{\theta}{2} < (N + 1/2)^2 \pi^2$.
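For a constant real potential the counting statement in part (a) can be checked directly, since the Dirichlet eigenvalues are then exactly $n^2\pi^2 + c$ and $\|q\|_{L^1} = |c|$. The sketch below is an added illustration; the helper name `count_check` is hypothetical, and $N$ is taken as the smallest integer strictly larger than $2\exp(|c|)$.

```python
import math

def count_check(c):
    """Compare the number of Dirichlet eigenvalues in H with N for q = c.

    For a constant real potential the eigenvalues are n^2 pi^2 + c, and
    ||q||_{L^1} = |c|, so N is the smallest integer above 2 exp(|c|).
    """
    n_min = math.floor(2.0 * math.exp(abs(c))) + 1
    bound = (n_min + 0.5) ** 2 * math.pi ** 2  # half-plane Re(lam) < bound
    count = 0
    n = 1
    while n * n * math.pi ** 2 + c < bound:
        count += 1
        n += 1
    return n_min, count
```

Calling `count_check(c)` returns the pair (N, number of eigenvalues with real part below $(N + 1/2)^2\pi^2$); by Lemma 5.9 the two entries should agree.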


This lemma proves again the existence of infinitely many eigenvalues. The arguments are not changed for the case of complex-valued functions $q$. In this case, the general spectral theory is not applicable anymore because the boundary value problem is not self-adjoint. This lemma also provides more information about the eigenvalue distribution, even for the real-valued case. First, we order the eigenvalues in the form

\[ \lambda_1 < \lambda_2 < \lambda_3 < \cdots. \]

Lemma 5.9 implies that

\[ \sqrt{\lambda_n} = n\pi + O(1); \quad\text{that is,}\quad \lambda_n = n^2\pi^2 + O(n). \tag{5.25} \]

For the treatment of the inverse problem, it is necessary to improve this formula. It is our aim to prove that

\[ \lambda_n = n^2\pi^2 + \int_0^1 q(t)\, dt + \tilde\lambda_n \quad\text{where}\quad \sum_{n=1}^{\infty} |\tilde\lambda_n|^2 < \infty. \tag{5.26} \]

There are several methods to prove (5.26). We follow the treatment in [218]. The key is to apply the fundamental theorem of calculus to the function $t \mapsto \lambda_n(tq)$ for $t \in [0,1]$, thus connecting the eigenvalues $\lambda_n$ corresponding to $q$ with the eigenvalues $n^2\pi^2$ corresponding to $q = 0$ by the parameter $t$. For this approach, we need the differentiability of the eigenvalues with respect to $q$.

For fixed $n \in \mathbb{N}$, the function $q \mapsto \lambda_n(q)$ from $L^2(0,1)$ into $\mathbb{C}$ is well-defined and Fréchet differentiable by the following theorem.


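Theorem 5.10 For every $n \in \mathbb{N}$, the mapping $q \mapsto \lambda_n(q)$ from $L^2(0,1)$ into $\mathbb{C}$ is continuously Fréchet differentiable for every $\hat q \in L^2(0,1)$ and

\[ \lambda_n'(\hat q)\, q = \int_0^1 g_n(x, \hat q)^2\, q(x)\, dx, \quad q \in L^2(0,1). \tag{5.27} \]

Here,

\[ g_n(x, \hat q) := \frac{u_2(x, \hat\lambda_n, \hat q)}{\|u_2(\cdot, \hat\lambda_n, \hat q)\|_{L^2}} \]

denotes the $L^2$-normalized eigenfunction corresponding to $\hat\lambda_n := \lambda_n(\hat q)$. Note that the integral is well-defined because $u_2(\cdot, \hat\lambda_n, \hat q) \in H^2(0,1) \subset C[0,1]$.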


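Proof: We observe that $u_2(1, \hat\lambda_n, \hat q) = 0$ and apply the implicit function theorem to the equation

\[ u_2(1, \lambda, q) = 0 \]

in a neighborhood of $(\hat\lambda_n, \hat q)$. This is possible because the zero $\hat\lambda_n$ of $u_2(1, \cdot, \hat q)$ is simple by Theorem 5.7. The implicit function theorem (see Appendix A.7, Theorem A.66) yields the existence of a unique function $\lambda_n = \lambda_n(q)$ such that $u_2(1, \lambda_n(q), q) = 0$ for all $q$ in a neighborhood of $\hat q$; we know this already. But it also implies that the function $\lambda_n$ is continuously differentiable with respect to $q$ and

\[ 0 = \frac{\partial}{\partial\lambda}\, u_2(1, \hat\lambda_n, \hat q)\; \lambda_n'(\hat q)\, q + \frac{\partial}{\partial q}\, u_2(1, \hat\lambda_n, \hat q)\, q; \]

that is, $u_{2,\lambda}(1)\, \lambda_n'(\hat q)\, q + u_{2,q}(1) = 0$. With Theorem 5.6, part (c), we conclude that

\[ \lambda_n'(\hat q)\, q = -\frac{u_{2,q}(1)}{u_{2,\lambda}(1)} = -\frac{u_{2,q}(1)\, u_2'(1)}{u_{2,\lambda}(1)\, u_2'(1)} = -\frac{[u_{2,q}, u_2](1)}{[u_{2,\lambda}, u_2](1)} = \frac{\int_0^1 q(x)\, u_2(x)^2\, dx}{\int_0^1 u_2(x)^2\, dx} = \int_0^1 g_n(x, \hat q)^2\, q(x)\, dx, \]

where we have dropped the arguments $\hat\lambda$ and $\hat q$.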
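It may help to evaluate formula (5.27) at the unperturbed potential $\hat q = 0$. The short computation below is an added illustration: it uses only the known Dirichlet eigenfunctions $\sqrt{2}\sin(n\pi x)$ for $q = 0$ and the identity $2\sin^2\alpha = 1 - \cos 2\alpha$.

```latex
\begin{align*}
g_n(x,0) &= \sqrt{2}\,\sin(n\pi x), \\
\lambda_n'(0)\,q &= \int_0^1 2\sin^2(n\pi x)\,q(x)\,dx
  = \int_0^1 q(x)\,dx \;-\; \int_0^1 q(x)\cos(2n\pi x)\,dx .
\end{align*}
```

So the first-order change of $\lambda_n$ away from $n^2\pi^2$ consists of the mean value of $q$ minus a cosine Fourier coefficient of $q$.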


Now we are ready to formulate and prove the main theorem which follows: 




Theorem 5.11 Let $Q \subset L^2(0,1)$ be bounded, $q \in Q$, and $\lambda_n \in \mathbb{C}$ the corresponding eigenvalues. Then we have

\[ \lambda_n = n^2\pi^2 + \int_0^1 q(t)\, dt - \int_0^1 q(t) \cos(2n\pi t)\, dt + O(1/n) \tag{5.28} \]

for $n \to \infty$ uniformly for $q \in Q$. Furthermore, the corresponding eigenfunctions $g_n$, normalized to $\|g_n\|_{L^2} = 1$, have the following asymptotic behavior:

\[ g_n(x) = \sqrt{2}\, \sin(n\pi x) + O(1/n) \quad\text{and} \tag{5.29a} \]
\[ g_n'(x) = \sqrt{2}\, n\pi \cos(n\pi x) + O(1) \tag{5.29b} \]

as $n \to \infty$ uniformly for $x \in [0,1]$ and $q \in Q$.


We observe that the second integral on the right-hand side of (5.28) is the $n$th Fourier coefficient $a_n$ of $q$ with respect to $\{ \cos(2\pi n t) : n = 0, 1, 2, \ldots \}$. From Fourier theory, it is known that $a_n$ converges to zero, and even more: Bessel's inequality (A.7) yields that $\sum_{n=0}^{\infty} |a_n|^2 < \infty$; that is, (5.26) is satisfied. If $q$ is smooth enough, e.g., continuously differentiable, then $a_n$ tends to zero faster than $1/n$. In that case, this term is absorbed in the $O(1/n)$ expression.
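The different decay rates of the coefficients $a_n$ are easy to observe numerically. The sketch below is illustrative only: `cosine_coefficient` is a hypothetical helper using the midpoint rule, and the two test potentials are assumptions chosen because their coefficients are known in closed form, namely $a_n = 1/(2\pi^2 n^2)$ for $q(t) = t^2$ and $a_n = \sin(0.8\pi n)/(2\pi n)$ for the step function equal to 1 on $[0, 0.4)$ and 0 elsewhere.

```python
import math

def cosine_coefficient(q, n, n_cells=4000):
    """Midpoint-rule approximation of a_n = int_0^1 q(t) cos(2 pi n t) dt."""
    h = 1.0 / n_cells
    total = 0.0
    for k in range(n_cells):
        t = (k + 0.5) * h
        total += q(t) * math.cos(2.0 * math.pi * n * t)
    return h * total

def smooth(t):
    """Continuously differentiable example: a_n decays like 1/n^2."""
    return t * t

def step(t):
    """Discontinuous example: a_n decays only like 1/n."""
    return 1.0 if t < 0.4 else 0.0
```

Comparing `n * cosine_coefficient(smooth, n)` with `n * cosine_coefficient(step, n)` for growing `n` shows that only the smooth potential yields a term that is absorbed into $O(1/n)$.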


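Proof: We split the proof into four parts:

(a) First, we show that $g_n(x) = \sqrt{2}\, \sin(\sqrt{\lambda_n}\,x) + O(1/n)$ uniformly for $(x, q) \in [0,1] \times Q$. By Lemma 5.9, we know that $\sqrt{\lambda_n} = n\pi + O(1)$, and thus by Theorem 5.5

\[ u_2(x, \lambda_n) = \frac{\sin(\sqrt{\lambda_n}\,x)}{\sqrt{\lambda_n}} + O(1/n^2). \]

With the formula $2 \int_0^1 \sin^2(\alpha t)\, dt = 1 - \sin(2\alpha)/(2\alpha)$, we compute

\[ \int_0^1 u_2(t, \lambda_n)^2\, dt = \frac{1}{\lambda_n} \int_0^1 \sin^2(\sqrt{\lambda_n}\,t)\, dt + O(1/n^3) = \frac{1}{2\lambda_n} \Bigl[ 1 - \frac{\sin(2\sqrt{\lambda_n})}{2\sqrt{\lambda_n}} \Bigr] + O(1/n^3) = \frac{1}{2\lambda_n} \bigl[ 1 + O(1/n) \bigr]. \]

Therefore, we have

\[ g_n(x) = \frac{u_2(x, \lambda_n)}{\sqrt{\int_0^1 u_2(t, \lambda_n)^2\, dt}} = \sqrt{2}\, \sin(\sqrt{\lambda_n}\,x) + O(1/n). \]

(b) Now we show that $\sqrt{\lambda_n} = n\pi + O(1/n)$ and $g_n(x) = \sqrt{2}\, \sin(n\pi x) + O(1/n)$. We apply the fundamental theorem of calculus and use Theorem 5.10:

\[ \lambda_n - n^2\pi^2 = \lambda_n(q) - \lambda_n(0) = \int_0^1 \frac{d}{dt}\, \lambda_n(tq)\, dt = \int_0^1 \int_0^1 g_n(x, tq)^2\, q(x)\, dx\, dt = O(1). \tag{5.30} \]

This yields $\sqrt{\lambda_n} = n\pi + O(1/n)$ and, with part (a), the asymptotic form $g_n(x) = \sqrt{2}\, \sin(n\pi x) + O(1/n)$.

(c) Now the asymptotics of the eigenvalues follow easily from (5.30) by the observation that

\[ g_n(x, tq)^2 = 2 \sin^2(n\pi x) + O(1/n) = 1 - \cos(2n\pi x) + O(1/n), \]

uniformly for $t \in [0,1]$ and $q \in Q$.

(d) Similarly, we have for the derivatives

\[ g_n'(x) = \frac{u_2'(x, \lambda_n)}{\sqrt{\int_0^1 u_2(t, \lambda_n)^2\, dt}} = \frac{\sqrt{2}\, \sqrt{\lambda_n}\, \bigl[ \cos(\sqrt{\lambda_n}\,x) + O(1/n) \bigr]}{\sqrt{1 + O(1/n)}} = \sqrt{2}\, n\pi \cos(n\pi x) + O(1). \]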


Example 5.12
We illustrate Theorem 5.11 by the following two numerical examples:

(a) Let $q_1(x) = \exp(\sin(2\pi x))$, $x \in [0,1]$. Then $q_1$ is analytic and periodic with period 1.

Plots of the characteristic functions $\lambda \mapsto f(\lambda)$ for $q_1$ and $q = 0$; that is, $\lambda \mapsto \sin\sqrt{\lambda}/\sqrt{\lambda}$, are shown in Figure 5.1.

Figure 5.1: Characteristic functions of $q$, $q_1$, respectively, on [0, 20] and [5, 100]

(b) Let $q_2(x) = -5x$ for $0 \le x \le 0.4$ and $q_2(x) = 4$ for $0.4 < x \le 1$. The function $q_2$ is not continuous.

Plots of the characteristic functions $\lambda \mapsto f(\lambda)$ for $q_2$ and $q = 0$ are shown in Figure 5.2.

Figure 5.2: Characteristic functions of $q$, $q_2$, respectively, on [0, 20] and [5, 100]


The Fourier coefficients of $q_1$ converge to zero of exponential order. The following table shows the eigenvalues $\lambda_n$ corresponding to $q_1$, the eigenvalues $n^2\pi^2$ corresponding to $q = 0$, and the difference

\[ c_n := \lambda_n - n^2\pi^2 - \int_0^1 q(x)\, dx \quad\text{for } n = 1, \ldots, 10: \]

    lambda_n    n^2 pi^2    c_n
      11.1         9.9      -2.04 × 10^{-2}
      40.9        39.5       1.49 × 10^{-1}
      90.1        88.8       2.73 × 10^{-3}
     159.2       157.9      -1.91 × 10^{-3}
     248.0       246.7       7.74 × 10^{-4}
     356.6       354.3       4.58 × 10^{-4}
     484.9       483.6       4.58 × 10^{-4}
     632.9       631.7       4.07 × 10^{-4}
     800.7       799.4       3.90 × 10^{-4}
     988.2       987.0       3.83 × 10^{-4}


We clearly observe the rapid convergence. 


Because $q_2$ is not continuous, the Fourier coefficients converge to zero only slowly. Again, we list the eigenvalues $\lambda_n$ for $q_2$, the eigenvalues $n^2\pi^2$ corresponding to $q = 0$, and the differences

\[ c_n := \lambda_n - n^2\pi^2 - \int_0^1 q(x)\, dx, \]
\[ d_n := \lambda_n - n^2\pi^2 - \int_0^1 q(x)\, dx + \int_0^1 q(x) \cos(2\pi n x)\, dx \]

for $n = 1, \ldots, 10$:

    lambda_n    n^2 pi^2    c_n                 d_n
      12.1         9.9       1.86 × 10^{-1}     -1.46 × 10^{-1}
      41.1        39.5      -3.87 × 10^{-1}      8.86 × 10^{-2}
      91.1        88.8       3.14 × 10^{-1}      2.13 × 10^{-2}
     159.8       157.9       1.61 × 10^{-1}     -6.70 × 10^{-3}
     248.8       246.7       2.07 × 10^{-2}      2.07 × 10^{-2}
     357.4       354.3       8.29 × 10^{-2}     -4.24 × 10^{-3}
     484.5       483.6      -1.25 × 10^{-1}      6.17 × 10^{-3}
     633.8       631.7       1.16 × 10^{-1}      3.91 × 10^{-3}
     801.4       799.4      -6.66 × 10^{-2}     -1.38 × 10^{-3}
     989.0       987.0       5.43 × 10^{-3}      5.43 × 10^{-3}
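Eigenvalues like those tabulated in this example can be approximated by a simple shooting method: evaluate the characteristic function $f(\lambda) = u_2(1, \lambda, q)$ by integrating the initial value problem and locate its zeros by bisection. The code below is a minimal sketch under stated assumptions and is not the method used to produce the tables: it uses the classical Runge-Kutta scheme with a fixed step, and it assumes that $q$ is small enough for exactly one zero of $f$ to lie between $((n - 1/2)\pi)^2$ and $((n + 1/2)\pi)^2$. The function names are hypothetical.

```python
import math

def char_function(lam, q, n_steps=400):
    """f(lam) = u2(1, lam, q), computed with the classical RK4 scheme.

    Integrates u'' = (q(x) - lam) u with u(0) = 0, u'(0) = 1 on [0, 1].
    """
    h = 1.0 / n_steps
    u, v = 0.0, 1.0
    for k in range(n_steps):
        x = k * h
        c0, cm, c1 = q(x) - lam, q(x + 0.5 * h) - lam, q(x + h) - lam
        k1u, k1v = v, c0 * u
        k2u, k2v = v + 0.5 * h * k1v, cm * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, cm * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, c1 * (u + h * k3u)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return u

def dirichlet_eigenvalue(n, q, tol=1e-10):
    """Bisection for the n-th zero of f inside a fixed bracket.

    Assumption: q is small enough that exactly one zero of f lies
    between ((n - 1/2) pi)^2 and ((n + 1/2) pi)^2.
    """
    lo, hi = ((n - 0.5) * math.pi) ** 2, ((n + 0.5) * math.pi) ** 2
    f_lo = char_function(lo, q)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = char_function(mid, q)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

For a constant potential $q = c$ the exact eigenvalues are $n^2\pi^2 + c$, which provides a direct check; for $q_1(x) = \exp(\sin(2\pi x))$ the first eigenvalue should come out close to the tabulated value 11.1.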


Now we sketch the modifications necessary for Sturm-Liouville eigenvalue problems of the type

\[ -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad 0 < x < 1, \tag{5.31a} \]
\[ u(0) = 0, \quad u'(1) + H u(1) = 0. \tag{5.31b} \]

Now the eigenvalues are zeros of the characteristic function

\[ f(\lambda) = u_2'(1, \lambda, q) + H\, u_2(1, \lambda, q), \quad \lambda \in \mathbb{C}. \tag{5.32} \]

For the special case, where $q = 0$, we have $u_2(x, \lambda, 0) = \sin(\sqrt{\lambda}\,x)/\sqrt{\lambda}$. The characteristic function for this case is then given by

\[ g(\lambda) = \cos\sqrt{\lambda} + H\, \frac{\sin\sqrt{\lambda}}{\sqrt{\lambda}}. \]

The zeros of $f$ for $q = 0$ and $H = 0$ are $\lambda_n = (n + 1/2)^2 \pi^2$, $n = 0, 1, 2, \ldots$ If $H \ne 0$, one has to solve the transcendental equation $z \cot z + H = 0$. One can show (see Problem 5.2) by an application of the implicit function theorem in $\mathbb{R}^2$ that the eigenvalues for $q = 0$ behave as

\[ \lambda_n = (n + 1/2)^2 \pi^2 + 2H + O(1/n). \]


Theorem 5.7 is also valid because the boundary value problem is again self-adjoint. The Counting Lemma 5.9 now takes the following form:

Lemma 5.13 Let $q \in L^2(0,1)$ and $N > 2 \exp(\|q\|_{L^1})\,(1 + |H|)$ be an integer. Then we have

(a) The mapping $f(\lambda) := u_2'(1, \lambda, q) + H\, u_2(1, \lambda, q)$ has exactly $N$ zeros in the half-plane

\[ H := \{ \lambda \in \mathbb{C} : \mathrm{Re}\,\lambda < N^2\pi^2 \}. \]

(b) $f$ has exactly one zero in every set

\[ U_m := \{ \lambda \in \mathbb{C} : |\sqrt{\lambda} - (m - 1/2)\pi| < \pi/2 \} \]

provided $m > N$.

(c) There are no other zeros of $f$ in $\mathbb{C}$.


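For the proof, we refer to Problem 5.3. We can apply the implicit function theorem to the equation

\[ u_2'(1, \lambda_n(q), q) + H\, u_2(1, \lambda_n(q), q) = 0 \]

because the zeros are again simple. Differentiating this equation with respect to $q$ yields

\[ \bigl[ u_{2,\lambda}'(1, \hat\lambda_n, \hat q) + H\, u_{2,\lambda}(1, \hat\lambda_n, \hat q) \bigr]\, \lambda_n'(\hat q)\, q + u_{2,q}'(1, \hat\lambda_n, \hat q) + H\, u_{2,q}(1, \hat\lambda_n, \hat q) = 0. \]

Theorem 5.6 yields

\[ \int_0^1 u_2(t)^2\, dt = u_{2,\lambda}(1)\, \underbrace{u_2'(1)}_{= -H u_2(1)} - u_{2,\lambda}'(1)\, u_2(1) = -u_2(1) \bigl[ u_{2,\lambda}'(1) + H\, u_{2,\lambda}(1) \bigr], \]

where again we have dropped the arguments $\hat\lambda_n$ and $\hat q$. Analogously, we compute

\[ -\int_0^1 q(t)\, u_2(t)^2\, dt = -u_2(1) \bigl[ u_{2,q}'(1) + H\, u_{2,q}(1) \bigr] \]

and thus

\[ \lambda_n'(\hat q)\, q = -\frac{u_{2,q}'(1) + H\, u_{2,q}(1)}{u_{2,\lambda}'(1) + H\, u_{2,\lambda}(1)} = \frac{\int_0^1 q(t)\, u_2(t)^2\, dt}{\int_0^1 u_2(t)^2\, dt}. \]

This has the same form as before. We continue as in the case of the Dirichlet boundary condition and arrive at Theorem 5.14.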


Theorem 5.14 Let $Q \subset L^2(0,1)$ be bounded, $q \in Q$, and $H \in \mathbb{R}$. The eigenvalues $\lambda_n$ have the asymptotic form

\[ \lambda_n = \Bigl( n + \frac{1}{2} \Bigr)^2 \pi^2 + 2H + \int_0^1 q(t)\, dt - \int_0^1 q(t) \cos\bigl( (2n+1)\pi t \bigr)\, dt + O(1/n) \tag{5.33} \]

as $n$ tends to infinity, uniformly in $q \in Q$. For the $L^2$-normalized eigenfunctions, we have

\[ g_n(x) = \sqrt{2}\, \sin\bigl( (n + 1/2)\pi x \bigr) + O(1/n) \quad\text{and} \tag{5.34a} \]
\[ g_n'(x) = \sqrt{2}\, (n + 1/2)\pi \cos\bigl( (n + 1/2)\pi x \bigr) + O(1) \tag{5.34b} \]

uniformly for $x \in [0,1]$ and $q \in Q$.


As mentioned at the beginning of this section, there are other ways to prove the asymptotic formulas for the eigenvalues and eigenfunctions that avoid Lemma 5.9 and the differentiability of $\lambda_n$ with respect to $q$. But the proof in, e.g., [276], seems to yield only the asymptotic behavior

\[ \lambda_n = m_n^2 \pi^2 + \int_0^1 q(t)\, dt + O(1/n) \]

instead of (5.28). Here, $(m_n)$ denotes some sequence of natural numbers.

Before we turn to the inverse problem, we make some remarks concerning the case where $q$ is complex-valued. Now the eigenvalue problems are no longer self-adjoint, and the general spectral theory is not applicable anymore. With respect to Theorem 5.7, it is still easy to show that the eigenfunctions corresponding to different eigenvalues are linearly independent and that the geometric multiplicities are still one. The Counting Lemma 5.9 is valid without restrictions. From this, we observe also that the algebraic multiplicities of $\lambda_n$ are one, at least for $n > N$. Thus, the remaining arguments of this section are valid if we restrict ourselves to the eigenvalues $\lambda_n$ with $n > N$. Therefore, the asymptotic formulas (5.28), (5.29a), (5.29b), (5.33), (5.34a), and (5.34b) hold equally well for complex-valued $q$.


5.4 Some Hyperbolic Problems 


As a preparation for the following sections, in particular, Sections 5.5 and 5.7, we study some initial value problems for the two-dimensional linear hyperbolic partial differential equation

\[ \frac{\partial^2 W(x,t)}{\partial x^2} - \frac{\partial^2 W(x,t)}{\partial t^2} + a(x,t)\, W(x,t) = 0, \]

where the coefficient $a$ has the special form $a(x,t) = p(t) - q(x)$. It is well-known that the method of characteristics reduces initial value problems to Volterra integral equations of the second kind, which can be studied in spaces of continuous functions. This approach naturally leads to solution concepts for nonsmooth coefficients and boundary data. We summarize the results in three theorems. In each of them, we formulate first the results for the case of smooth coefficients and then for the nonsmooth case. We remark that it is not our aim to relax the solution concept to the weakest possible case but rather to relax the assumptions only to the extent needed in Sections 5.5 and 5.7 and in Subsection 7.6.3.

Although most of the problems, at least for smooth data, are subjects of elementary courses on partial differential equations, we include the complete proofs for the convenience of the reader.

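Before we begin with the statements of the theorems, we recall some function spaces

\[ C_{00}[0,1] := \{ f \in C[0,1] : f(0) = 0 \}, \]
\[ H^1(0,1) := \Bigl\{ f \in C[0,1] : f(x) = \alpha + \int_0^x g(t)\, dt, \ \alpha \in \mathbb{R}, \ g \in L^2(0,1) \Bigr\}, \]
\[ H^1_{00}(0,1) := H^1(0,1) \cap C_{00}[0,1] \]

and equip them with their canonical norms

\[ \|f\|_\infty := \max_{0 \le x \le 1} |f(x)| \quad\text{in } C_{00}[0,1], \]
\[ \|f\|_{H^1} := \sqrt{\|f\|_{L^2}^2 + \|f'\|_{L^2}^2} \quad\text{in } H^1(0,1) \text{ and } H^1_{00}(0,1). \]

The notations $C_{00}[0,1]$ and $H^1_{00}(0,1)$ should indicate that the boundary condition is set only at $x = 0$. By $\|\cdot\|_{C^j}$ for $j \ge 1$ we denote the canonical norm in $C^j[0,1]$.

Furthermore, we define the triangular regions $\Delta_0 \subset \mathbb{R}^2$ and $\Delta \subset \mathbb{R}^2$ by

\[ \Delta_0 := \{ (x,t) \in \mathbb{R}^2 : 0 < t < x < 1 \}, \tag{5.35a} \]
\[ \Delta := \{ (x,t) \in \mathbb{R}^2 : |t| < x < 1 \}, \tag{5.35b} \]

respectively. We begin with an initial value problem, sometimes called the Goursat problem.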


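Theorem 5.15 (a) Let $p, q \in C[0,1]$ and $f \in C^2[0,1]$ with $f(0) = 0$. Then there exists a unique solution $W \in C^2(\overline{\Delta_0})$ of the following hyperbolic initial value problem:

\[ \frac{\partial^2 W(x,t)}{\partial x^2} - \frac{\partial^2 W(x,t)}{\partial t^2} + a(x,t)\, W(x,t) = 0 \quad\text{in } \Delta_0, \tag{5.36a} \]
\[ W(x,x) = f(x), \quad 0 \le x \le 1, \tag{5.36b} \]
\[ W(x,0) = 0, \quad 0 \le x \le 1, \tag{5.36c} \]

where $a(x,t) := p(t) - q(x)$.

(b) The solution operator $(p, q, f) \mapsto W$ has an extension to a bounded operator from $L^2(0,1) \times L^2(0,1) \times C_{00}[0,1]$ into $C(\overline{\Delta_0})$.

(c) The operator $(p, q, f) \mapsto (W(1,\cdot), W_x(1,\cdot))$ has an extension to a bounded operator from $L^2(0,1) \times L^2(0,1) \times H^1_{00}(0,1)$ into $H^1(0,1) \times L^2(0,1)$. Here and in the following, we denote by $W_x$ the partial derivative with respect to $x$.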




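Proof: (a) First, we extend the problem to the larger region $\Delta$ and study the problem

\[ \frac{\partial^2 W(x,t)}{\partial x^2} - \frac{\partial^2 W(x,t)}{\partial t^2} + a(x,t)\, W(x,t) = 0 \quad\text{in } \Delta, \tag{5.37a} \]
\[ W(x,x) = f(x), \quad 0 \le x \le 1, \tag{5.37b} \]
\[ W(x,-x) = -f(x), \quad 0 \le x \le 1, \tag{5.37c} \]

where we have extended $p(t) - q(x)$ to $a(x,t) := p(|t|) - q(x)$ for $(x,t) \in \Delta$. To treat problem (5.37a)-(5.37c), we make the change of variables

\[ x = \xi + \eta, \qquad t = \xi - \eta. \]

Then $(x,t) \in \Delta$ if and only if $(\xi, \eta) \in D$, where

\[ D := \{ (\xi, \eta) \in (0,1) \times (0,1) : \eta + \xi < 1 \}. \tag{5.38} \]

We set $w(\xi, \eta) := W(\xi + \eta, \xi - \eta)$ for $(\xi, \eta) \in D$. Then $W$ solves problem (5.37a)-(5.37c) if and only if $w$ solves the hyperbolic problem

\[ \frac{\partial^2 w(\xi,\eta)}{\partial\xi\, \partial\eta} = \underbrace{-a(\xi + \eta, \xi - \eta)}_{=:\ \tilde a(\xi,\eta)}\, w(\xi, \eta), \quad (\xi, \eta) \in D, \tag{5.39a} \]
\[ w(\xi, 0) = f(\xi) \quad\text{for } \xi \in [0,1], \tag{5.39b} \]
\[ w(0, \eta) = -f(\eta) \quad\text{for } \eta \in [0,1]. \tag{5.39c} \]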
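The equivalence of (5.37a) and (5.39a) rests on a short chain-rule computation, spelled out here for convenience (an added worked step, assuming $W$ is twice continuously differentiable). With $\xi = (x + t)/2$ and $\eta = (x - t)/2$ we have $W(x,t) = w(\xi, \eta)$ and

```latex
\begin{align*}
W_x &= \tfrac12\,(w_\xi + w_\eta), &
W_t &= \tfrac12\,(w_\xi - w_\eta), \\
W_{xx} &= \tfrac14\,(w_{\xi\xi} + 2\,w_{\xi\eta} + w_{\eta\eta}), &
W_{tt} &= \tfrac14\,(w_{\xi\xi} - 2\,w_{\xi\eta} + w_{\eta\eta}), \\
W_{xx} - W_{tt} &= w_{\xi\eta}, &
\text{so}\quad w_{\xi\eta} &= -a(\xi+\eta,\,\xi-\eta)\,w .
\end{align*}
```

The line $t = x$ corresponds to $\eta = 0$ and the line $t = -x$ to $\xi = 0$, which turns the two boundary conditions on the characteristics into conditions on the coordinate axes.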


Now let $w$ be a solution of (5.39a)-(5.39c). We integrate the differential equation twice and use the initial conditions. Then $w$ solves the integral equation

\[ w(\xi, \eta) = \int_0^\eta \int_0^\xi \tilde a(\xi', \eta')\, w(\xi', \eta')\, d\xi'\, d\eta' - f(\eta) + f(\xi), \tag{5.40} \]

for $(\xi, \eta) \in D$. This is a Volterra integral equation in two dimensions. We use the standard method to solve this equation by successive iteration in $C(\overline{D})$ where we assume only $p, q \in L^2(0,1)$, and thus $\tilde a \in L^2(D)$. Let $A$ be the Volterra integral operator defined by the integral on the right-hand side of (5.40). By induction with respect to $n \in \mathbb{N}$, it can easily be seen (compare with the proof of Theorem 5.18 below for a similar, but more complicated estimate) that

\[ |(A^n w)(\xi, \eta)| \le \|w\|_\infty\, \|\tilde a\|_{L^2}^n\, \frac{1}{n!}\, (\xi\eta)^{n/2}, \quad n = 1, 2, \ldots; \]

thus $\|A^n w\|_\infty \le \|w\|_\infty\, \|\tilde a\|_{L^2}^n\, \frac{1}{n!}$. Therefore, $\|A^n\|_{\mathcal{L}(C(\overline{D}))} < 1$ for sufficiently large $n$, and the Neumann series converges (see Appendix A.3, Theorem A.31). This proves that there exists a unique solution $w \in C(\overline{D})$ of (5.40). From our arguments, uniqueness also holds for (5.37a)-(5.37c).
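The successive iteration for (5.40) is straightforward to try out numerically. The following is a minimal sketch under stated assumptions, not the author's method: it uses a uniform grid on the closed triangle $\overline{D}$, a cell-wise trapezoidal rule for the double integral, and a fixed number of Picard iterations. The name `solve_goursat` and its parameters are hypothetical. As a test case one can take $q(x) = 2 + x^2/4$, $p(t) = t^2/4$, $f(x) = x$, for which $\tilde a(\xi, \eta) = 2 + \xi\eta$ and a direct computation shows that $w(\xi, \eta) = (\xi - \eta)\, e^{\xi\eta}$ satisfies (5.39a)-(5.39c).

```python
import math

def solve_goursat(q, p, f, n=60, n_iter=25):
    """Picard iteration for the integral equation (5.40) on a triangular grid.

    Grid points are (i*h, j*h) with i + j <= n; row i of every table has
    n + 1 - i entries. Returns the step size h and the table w[i][j].
    """
    h = 1.0 / n
    rows = range(n + 1)
    # kernel a~(xi, eta) = q(xi + eta) - p(|xi - eta|)
    a = [[q((i + j) * h) - p(abs(i - j) * h) for j in range(n + 1 - i)] for i in rows]
    # inhomogeneity f(xi) - f(eta)
    bdry = [[f(i * h) - f(j * h) for j in range(n + 1 - i)] for i in rows]
    w = [row[:] for row in bdry]
    for _ in range(n_iter):
        integral = [[0.0] * (n + 1 - i) for i in rows]
        for i in range(1, n + 1):
            for j in range(1, n + 1 - i):
                # trapezoidal rule on the cell [xi_{i-1}, xi_i] x [eta_{j-1}, eta_j]
                cell = 0.25 * h * h * (
                    a[i][j] * w[i][j] + a[i - 1][j] * w[i - 1][j]
                    + a[i][j - 1] * w[i][j - 1] + a[i - 1][j - 1] * w[i - 1][j - 1]
                )
                integral[i][j] = (integral[i - 1][j] + integral[i][j - 1]
                                  - integral[i - 1][j - 1] + cell)
        w = [[integral[i][j] + bdry[i][j] for j in range(n + 1 - i)] for i in rows]
    return h, w
```

The rectangle $[0, \xi] \times [0, \eta]$ stays inside $\overline{D}$ whenever $(\xi, \eta)$ does, which is why the cumulative sums above never leave the triangular grid.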

Now we prove that the solution $w \in C(\overline{D})$ is even in $C^2(\overline{D})$. Obviously, from (5.40) and the differentiability of $f$, we conclude that $w$ is differentiable with partial derivative (with respect to $\xi$, the derivative with respect to $\eta$ is seen analogously)

\[ w_\xi(\xi, \eta) = \int_0^\eta \bigl[ q(\xi + \eta') - p(|\xi - \eta'|) \bigr]\, w(\xi, \eta')\, d\eta' + f'(\xi) \]
\[ = \int_\xi^{\xi+\eta} q(s)\, w(\xi, s - \xi)\, ds - \int_{\xi-\eta}^{\xi} p(|s|)\, w(\xi, \xi - s)\, ds + f'(\xi). \]

This second form can be differentiated again. Thus $w \in C^2(\overline{D})$, and we have shown that $W$ is the unique solution of (5.37a)-(5.37c).

Because $a(x, \cdot)$ is an even function and the initial data are odd functions with respect to $t$, we conclude from the uniqueness result that the solution $W(x, \cdot)$ is also odd. In particular, this implies that $W(x, 0) = 0$ for all $x \in [0,1]$, which proves that $W$ solves problem (5.36a)-(5.36c) and finishes part (a).

Part (b) follows immediately from the integral equation (5.40) because the integral operator $A : C(\overline{D}) \to C(\overline{D})$ depends continuously on the kernel $\tilde a \in L^2(D)$.

For part (c), we observe that $W(1, 2\xi - 1) = w(\xi, 1 - \xi)$, thus

\[ W_x(1, 2\xi - 1) = \frac{1}{2}\, w_\xi(\xi, 1 - \xi) + \frac{1}{2}\, w_\eta(\xi, 1 - \xi). \]

We have computed $w_\xi$ already above and have

\[ w_\xi(\xi, 1 - \xi) = \int_\xi^1 q(s)\, w(\xi, s - \xi)\, ds - \int_{2\xi-1}^{\xi} p(|s|)\, w(\xi, \xi - s)\, ds + f'(\xi), \]

which is in $L^2(0,1)$ for $p, q \in L^2(0,1)$ and $f \in H^1_{00}(0,1)$. An analogous formula holds for $w_\eta$ and shows that $W(1, \cdot) \in H^1(0,1)$ and $W_x(1, \cdot) \in L^2(0,1)$ for $p, q \in L^2(0,1)$ and $f \in H^1_{00}(0,1)$. This ends the proof.


Remark 5.16 (a) If $p,q \in L^2(0,1)$ and $f \in C[0,1]$ with $f(0) = 0$, we call the
solution $W \in C(\Delta_0)$, given by

\[ W(x,t) = w\bigl(\tfrac12(x+t),\, \tfrac12(x-t)\bigr), \quad (x,t) \in \Delta_0, \]

where $w \in C(D)$ solves the integral equation (5.40), the weak solution of the
Goursat problem (5.36a)–(5.36c). We observe that for every weak solution $W$
there exist sequences $(p_n)$, $(q_n)$ in $C[0,1]$ and $(f_n)$ in $C^2[0,1]$ with $f_n(0) = 0$
and $\|p_n - p\|_{L^2} \to 0$, $\|q_n - q\|_{L^2} \to 0$, and $\|f_n - f\|_\infty \to 0$ such that the solutions
$W_n \in C^2(\Delta_0)$ of (5.36a)–(5.36c) corresponding to $p_n$, $q_n$, and $f_n$ converge
uniformly to $W$.


(b) We observe from the integral equation (5.40) that $w$ has a decomposition
into $w(\xi,\eta) = w_1(\xi,\eta) - f(\eta) + f(\xi)$ where $w_1 \in C^1(D)$ even if only $p,q \in L^2(0,1)$.
This transforms into $W(x,t) = W_1(x,t) - f\bigl(\tfrac12(x-t)\bigr) + f\bigl(\tfrac12(x+t)\bigr)$
with $W_1 \in C^1(\Delta_0)$.


For the special case $p = q = 0$, the integral equation (5.40) reduces to the
well-known solution formula

\[ W(x,t) = f\bigl(\tfrac12(x+t)\bigr) - f\bigl(\tfrac12(x-t)\bigr). \]
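The solution formula for $p = q = 0$ can be checked numerically. The following is a minimal sketch, not part of the original text: the profile `f` is a hypothetical example with $f(0) = 0$, and the helper names are illustrative only. It evaluates $W(x,t) = f(\tfrac12(x+t)) - f(\tfrac12(x-t))$ and tests the Goursat conditions $W(x,x) = f(x)$, $W(x,0) = 0$ and the wave equation by central differences.

```python
import math

def goursat_solution(f):
    """Return W(x, t) = f((x + t)/2) - f((x - t)/2) for a given profile f."""
    def W(x, t):
        return f(0.5 * (x + t)) - f(0.5 * (x - t))
    return W

def wave_residual(W, x, t, h=1e-3):
    """Central-difference approximation of W_xx - W_tt at the point (x, t)."""
    w_xx = (W(x + h, t) - 2.0 * W(x, t) + W(x - h, t)) / h**2
    w_tt = (W(x, t + h) - 2.0 * W(x, t) + W(x, t - h)) / h**2
    return w_xx - w_tt

# Hypothetical smooth profile with f(0) = 0 (an assumption for illustration).
f = lambda s: math.sin(2.0 * s) + s**2
W = goursat_solution(f)

print(W(0.7, 0.7) - f(0.7))          # condition on the characteristic t = x
print(W(0.7, 0.0))                   # condition on the line t = 0
print(wave_residual(W, 0.6, 0.3))    # should be close to zero
```

Any other sufficiently smooth profile with $f(0) = 0$ can be substituted for `f`.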


The next theorem studies a Cauchy problem for the same hyperbolic differential 
equation. 


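Theorem 5.17 (a) Let $f \in C^2[0,1]$, $g \in C^1[0,1]$ with $f(0) = f''(0) = g(0) = 0$,
and $p,q \in C[0,1]$ and $F \in C^1(\Delta_0)$. Then there exists a unique solution
$W \in C^2(\Delta_0)$ of the Cauchy problem

\[ \frac{\partial^2 W(x,t)}{\partial x^2} - \frac{\partial^2 W(x,t)}{\partial t^2} + \bigl(p(t) - q(x)\bigr)\, W(x,t) = F(x,t) \quad \text{in } \Delta_0, \tag{5.41a} \]
\[ W(1,t) = f(t) \quad \text{for } 0 \le t \le 1, \tag{5.41b} \]
\[ \frac{\partial}{\partial x} W(1,t) = g(t) \quad \text{for } 0 \le t \le 1. \tag{5.41c} \]

(b) Furthermore, the solution operator $(p,q,F,f,g) \mapsto W$ has an extension to a
bounded operator from $L^2(0,1) \times L^2(0,1) \times L^2(\Delta_0) \times H^1_{00}(0,1) \times L^2(0,1)$ into
$C(\Delta_0)$.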

Proof: As in the proof of Theorem 5.15, we set $a(x,t) := p(|t|) - q(x)$ for
$(x,t) \in \Delta$ and extend $F$ to an even function on $\Delta$ by $F(x,-t) = F(x,t)$ for
$(x,t) \in \Delta_0$. We also extend $f$ and $g$ to odd functions on $[-1,1]$ by $f(-t) = -f(t)$
and $g(-t) = -g(t)$ for $t \in [0,1]$. Then $F \in C^1\bigl(\Delta \setminus ([0,1] \times \{0\})\bigr) \cap C(\Delta)$,
$f \in C^2[-1,1]$, and $g \in C^1[-1,1]$. We again make the change of variables

\[ x = \xi + \eta, \quad t = \xi - \eta, \quad w(\xi,\eta) = W(\xi+\eta, \xi-\eta) \quad \text{for } (\xi,\eta) \in D, \]

where $D$ is given by (5.38). Then $W$ solves (5.41a)–(5.41c) if and only if $w$
solves

\[ \frac{\partial^2 w(\xi,\eta)}{\partial\xi\,\partial\eta} = \tilde a(\xi,\eta)\, w(\xi,\eta) + \tilde F(\xi,\eta), \quad (\xi,\eta) \in D, \]

where $\tilde F(\xi,\eta) = F(\xi+\eta,\xi-\eta)$ and $\tilde a(\xi,\eta) = -a(\xi+\eta,\xi-\eta) = q(\xi+\eta) - p(|\xi-\eta|)$.
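The Cauchy conditions (5.41b) and (5.41c) transform into

\[ w(\xi,1-\xi) = f(2\xi-1) \quad\text{and}\quad w_\xi(\xi,1-\xi) + w_\eta(\xi,1-\xi) = 2\,g(2\xi-1) \]

for $0 \le \xi \le 1$. Differentiating the first equation and solving for $w_\xi$ and $w_\eta$ yields
$w_\xi(\xi,1-\xi) = g(2\xi-1) + f'(2\xi-1)$ and $w_\eta(\xi,1-\xi) = g(2\xi-1) - f'(2\xi-1)$
for $0 \le \xi \le 1$. Integration of the differential equation with respect to $\xi$ from $\xi$
to $1-\eta$ yields

\[ \frac{\partial w(\xi,\eta)}{\partial\eta} = -\int_{\xi}^{1-\eta} \bigl[ \tilde a(\xi',\eta)\, w(\xi',\eta) + \tilde F(\xi',\eta) \bigr]\, d\xi' + g(1-2\eta) - f'(1-2\eta). \]

Now we integrate this equation with respect to $\eta$ from $\eta$ to $1-\xi$ and arrive at

\[
\begin{aligned}
w(\xi,\eta) ={}& \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} \bigl[ \tilde a(\xi',\eta')\, w(\xi',\eta') + \tilde F(\xi',\eta') \bigr]\, d\xi'\, d\eta' \\
& - \int_{\eta}^{1-\xi} g(1-2\eta')\, d\eta' + \tfrac12 f(2\xi-1) + \tfrac12 f(1-2\eta)
\end{aligned}
\tag{5.42}
\]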


for $(\xi,\eta) \in D$. This is again a Volterra integral equation in two variables. The
solution $w \in C(D)$ is in $C^2(D)$. Indeed, since it is obviously differentiable,
we take the derivative with respect to $\eta$ and arrive at the formula above and,
after substitution of $\tilde a(\xi',\eta)$ and making the change of variables $s = \xi' - \eta$ and
$s = \xi' + \eta$, respectively, at the representation

\[
\begin{aligned}
w_\eta(\xi,\eta) ={}& \int_{\xi-\eta}^{1-2\eta} p(|s|)\, w(s+\eta,\eta)\, ds - \int_{\xi+\eta}^{1} q(s)\, w(s-\eta,\eta)\, ds \\
& - \int_{\xi}^{1-\eta} \tilde F(\xi',\eta)\, d\xi' + g(1-2\eta) - f'(1-2\eta).
\end{aligned}
\]

An analogous formula holds for the derivative with respect to $\xi$. We can differentiate
again because the function $\psi(\xi,\eta) = \int_{\xi}^{1-\eta} \tilde F(\xi',\eta)\, d\xi'$ is differentiable in
$\eta$ although $\tilde F$ is not differentiable at the line $\xi = \eta$. The reader should try to
prove this. If only $p,q \in L^2(0,1)$, $F \in L^2(\Delta_0)$, $f \in H^1_{00}(0,1)$, and $g \in L^2(0,1)$,
then (5.42) defines the weak solution.


Let $A$ denote the integral operator

\[ (Aw)(\xi,\eta) = \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} \tilde a(\xi',\eta')\, w(\xi',\eta')\, d\xi'\, d\eta', \quad (\xi,\eta) \in D. \]

By induction, it is easily seen that

\[ |(A^n w)(\xi,\eta)| \le \|w\|_\infty\, \|\tilde a\|_{L^2}^n\, \frac{1}{\sqrt{(2n)!}}\, (1-\xi-\eta)^n \]

for all $(\xi,\eta) \in D$ and $n \in \mathbb{N}$; thus

\[ \|A^n w\|_\infty \le \|w\|_\infty\, \|\tilde a\|_{L^2}^n\, \frac{1}{\sqrt{(2n)!}} \]

for all $n \in \mathbb{N}$. For sufficiently large $n$, we conclude that $\|A^n\|_{\mathcal{L}(C(D))} < 1$, which
again implies that (5.42) is uniquely solvable in $C(D)$ for any $p,q,g \in L^2(0,1)$,
$F \in L^2(\Delta_0)$, and $f \in H^1_{00}(0,1)$.
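How large "sufficiently large $n$" has to be depends only on $\|\tilde a\|_{L^2}$. As a rough illustration (a sketch under the assumption that the bound $\|\tilde a\|_{L^2}^n / \sqrt{(2n)!}$ above is used as is; the function names are hypothetical), one can search for the first power $n$ for which the bound drops below one. Logarithms are used to avoid overflow of the factorial.

```python
import math

def neumann_bound(norm_a, n):
    """Upper bound norm_a**n / sqrt((2n)!) for the operator norm of A^n."""
    # lgamma(2n + 1) = log((2n)!), so the bound is evaluated in log form.
    return math.exp(n * math.log(norm_a) - 0.5 * math.lgamma(2 * n + 1))

def first_contractive_power(norm_a, n_max=10_000):
    """Smallest n whose bound is below 1, so that A^n is a contraction."""
    for n in range(1, n_max + 1):
        if neumann_bound(norm_a, n) < 1.0:
            return n
    raise ValueError("no contractive power found up to n_max")

for norm_a in (1.0, 10.0, 100.0):
    n = first_contractive_power(norm_a)
    print(norm_a, n, neumann_bound(norm_a, n))
```

The factorial in the denominator eventually dominates any geometric growth, which is the reason why no smallness assumption on $p$ and $q$ is needed.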


For the special case $p = q = 0$ and $F = 0$, the integral equation (5.42)
reduces to the well-known d'Alembert formula

\[ W(x,t) = -\frac12 \int_{t-(1-x)}^{t+(1-x)} g(\tau)\, d\tau + \frac12 f\bigl(t+(1-x)\bigr) + \frac12 f\bigl(t-(1-x)\bigr). \]
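A quick numerical check of the d'Alembert formula may be helpful. The sketch below is an illustration only: the Cauchy data $f(t) = \sin t$ and $g(t) = t$ are hypothetical choices (both odd, with $f(0) = f''(0) = g(0) = 0$), for which the formula gives $W(x,t) = -t(1-x) + \sin t \cos(1-x)$ in closed form. The integral is evaluated with Simpson's rule.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def dalembert(f, g, x, t):
    """W(x,t) from the Cauchy data W(1,.) = f and W_x(1,.) = g (both odd)."""
    s = 1.0 - x
    return (-0.5 * simpson(g, t - s, t + s)
            + 0.5 * f(t + s) + 0.5 * f(t - s))

f = math.sin              # assumed data: odd, f(0) = f''(0) = 0
g = lambda tau: tau       # assumed data: odd, g(0) = 0

def exact(x, t):
    """Closed form of the d'Alembert formula for this particular f and g."""
    return -t * (1.0 - x) + math.sin(t) * math.cos(1.0 - x)

print(dalembert(f, g, 0.6, 0.2), exact(0.6, 0.2))
print(dalembert(f, g, 1.0, 0.3), f(0.3))     # boundary condition W(1,t) = f(t)
```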


Finally, the third theorem studies a quite unusual coupled system for a pair 
(W,r) of functions. We treat this system with the same methods as above. 


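Theorem 5.18 (a) Let $q \in C[0,1]$, $F \in C^1(\Delta_0)$, $f \in C^2[0,1]$, and $g \in C^1[0,1]$
such that $f(0) = f''(0) = g(0) = 0$. Then there exists a unique pair of functions
$(W,r) \in C^2(\Delta_0) \times C^1[0,1]$ with

\[ \frac{\partial^2 W(x,t)}{\partial x^2} - \frac{\partial^2 W(x,t)}{\partial t^2} - q(x)\, W(x,t) = F(x,t)\, r(x) \quad \text{in } \Delta_0, \tag{5.43a} \]
\[ W(x,x) = \frac12 \int_0^x r(s)\, ds, \quad 0 \le x \le 1, \tag{5.43b} \]
\[ W(x,0) = 0, \quad 0 \le x \le 1, \tag{5.43c} \]

and

\[ W(1,t) = f(t) \quad\text{and}\quad \frac{\partial}{\partial x} W(1,t) = g(t) \quad \text{for all } t \in [0,1]. \tag{5.43d} \]

(b) Furthermore, the solution operator $(q,F,f,g) \mapsto (W,r)$ has an extension to
a bounded operator from $L^2(0,1) \times C(\Delta_0) \times H^1_{00}(0,1) \times L^2(0,1)$ into $C(\Delta_0) \times L^2(0,1)$.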
Proof: (a) We apply the same arguments as in the proofs of Theorems 5.15 and
5.17. We extend $F(x,\cdot)$ to an even function and $f$ and $g$ to odd functions.
We again make the change of variables $x = \xi + \eta$ and $t = \xi - \eta$ and set
$\tilde F(\xi,\eta) = F(\xi+\eta,\xi-\eta)$. In Theorem 5.17 (for $p = 0$), we have shown that
the Cauchy problem (5.43a) and (5.43d) for $W$ is equivalent to the
integral equation

\[
\begin{aligned}
w(\xi,\eta) ={}& \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} \bigl[ q(\xi'+\eta')\, w(\xi',\eta') + \tilde F(\xi',\eta')\, r(\xi'+\eta') \bigr]\, d\xi'\, d\eta' \\
& - \int_{\eta}^{1-\xi} g(1-2\eta')\, d\eta' + \tfrac12 f(2\xi-1) + \tfrac12 f(1-2\eta)
\end{aligned}
\tag{5.44a}
\]

for $w(\xi,\eta) = W(\xi+\eta,\xi-\eta)$ (see equation (5.42)). From this and the initial
condition (5.43b), we derive a second integral equation. We set $\eta = 0$ in (5.44a),
differentiate, and substitute (5.43b). This yields the following Volterra equation
after an obvious change of variables:

\[ \frac12\, r(x) = -\int_{x}^{1} \bigl[ q(s)\, w(x,s-x) + r(s)\, \tilde F(x,s-x) \bigr]\, ds + g(2x-1) + f'(2x-1). \tag{5.44b} \]


Assume that there exists a solution $(w,r) \in C(D) \times L^2(0,1)$ of (5.44a) and
(5.44b). From (5.44b), we observe that $r$ is continuous on $[0,1]$, and thus by
(5.44a), $w \in C^1(D)$ and thus also $r \in C^1[0,1]$ because the function
$\psi(x) = \int_x^1 r(s)\, \tilde F(x,s-x)\, ds$ is differentiable on $[0,1]$. Therefore, the right-hand side
of (5.43a) is differentiable and we conclude as in the previous theorem that
$w \in C^2(D)$. Furthermore, $\frac{d}{dx} W(x,x) = \frac{d}{dx} w(x,0) = \frac12 r(x)$. Now, because
$F(x,\cdot)$ is even and $f$ and $g$ are odd functions, we conclude that $W(x,\cdot)$ is also
odd. In particular, $W(x,0) = 0$ for all $x \in [0,1]$. This implies $W(0,0) = 0$ and
thus $W(x,x) = \frac12 \int_0^x r(s)\, ds$. Therefore, we have shown that every solution of
equations (5.44a) and (5.44b) satisfies (5.43a)–(5.43d) and vice versa.

Now we sketch the proof that the system (5.44a), (5.44b) is uniquely solvable
for $(w,r) \in C(D) \times L^2(0,1)$ for given $q \in L^2(0,1)$, $F \in C(\Delta_0)$, $f \in H^1_{00}(0,1)$,
and $g \in L^2(0,1)$. This would also include the proof of part (b). We write this
system in the form $(w,r) = A(w,r) + b$, where $b$ collects the terms that depend only
on $f$ and $g$, in the product space $C(D) \times L^2(0,1)$
which we equip with the norm $\|(w,r)\|_{\infty,L^2} = \max\{\|w\|_\infty, \|r\|_{L^2(0,1)}\}$. To apply
the fixed point theorem we define first the constant $c := 2\bigl[\|q\|_{L^2(0,1)} + \|F\|_\infty\bigr]$
and, for given $(w,r) \in C(D) \times L^2(0,1)$, the functions $(w_n,r_n) = A^n(w,r)$. By
induction we prove the following estimates:

\[ |w_n(\xi,\eta)| \le \|(w,r)\|_{\infty,L^2}\, \frac{c^n}{\sqrt{n!}}\, (1-\xi-\eta)^{n/2}, \quad (\xi,\eta) \in D, \]
\[ |r_n(x)| \le \|(w,r)\|_{\infty,L^2}\, \frac{c^n}{\sqrt{n!}}\, (1-x)^{n/2}, \quad x \in (0,1), \]

for all $n = 1,2,\ldots$.


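We use the elementary integral

\[ \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} (1-\xi'-\eta')^n\, d\xi'\, d\eta' = \frac{1}{(n+1)(n+2)}\, (1-\xi-\eta)^{n+2} \]

for $n = 0,1,\ldots$ and set $\tilde q(\xi,\eta) = q(\xi+\eta)$ and $\tilde r(\xi,\eta) = r(\xi+\eta)$ for abbreviation.
We note that $\|\tilde q\|_{L^2(D)}^2 = \iint_D |q(\xi+\eta)|^2\, d(\xi,\eta) = \int_0^1 t\, |q(t)|^2\, dt \le \|q\|_{L^2(0,1)}^2$
and analogously $\|\tilde r\|_{L^2(D)} \le \|r\|_{L^2(0,1)}$.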


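For $n = 1$ we have by the Cauchy–Schwarz inequality

\[
\begin{aligned}
|w_1(\xi,\eta)| &\le \|w\|_\infty \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} |\tilde q(\xi',\eta')|\, d\xi'\, d\eta' + \|F\|_\infty \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} |\tilde r(\xi',\eta')|\, d\xi'\, d\eta' \\
&\le \|(w,r)\|_{\infty,L^2} \bigl[ \|q\|_{L^2(0,1)} + \|F\|_\infty \bigr] \sqrt{ \int_{\eta}^{1-\xi} \int_{\xi}^{1-\eta'} d\xi'\, d\eta' } \\
&= c\, \|(w,r)\|_{\infty,L^2}\, \frac{1}{2\sqrt{2}}\, (1-\xi-\eta) \le c\, \|(w,r)\|_{\infty,L^2}\, \sqrt{1-\xi-\eta},
\end{aligned}
\]
\[
\begin{aligned}
|r_1(x)| &\le 2 \bigl[ \|w\|_\infty \|q\|_{L^2(0,1)} + \|r\|_{L^2(0,1)} \|F\|_\infty \bigr] \sqrt{1-x} \\
&\le 2\, \|(w,r)\|_{\infty,L^2} \bigl[ \|q\|_{L^2(0,1)} + \|F\|_\infty \bigr] \sqrt{1-x} = c\, \|(w,r)\|_{\infty,L^2}\, \sqrt{1-x}.
\end{aligned}
\]

The step from $n$ to $n+1$ is proven in just the same way. Therefore, $\|A^n\|_{\infty,L^2} \le c^n/\sqrt{n!}$,
which tends to zero as $n$ tends to infinity. Application of Theorem A.31
yields that (5.44a), (5.44b) has a unique solution in $C(D) \times L^2(0,1)$ for all
$q \in L^2(0,1)$, $F \in C(\Delta_0)$, $f \in H^1_{00}(0,1)$, and $g \in L^2(0,1)$.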


5.5 The Inverse Problem 


Now we study the inverse spectral problem: given the eigenvalues $\lambda_n$
of the Sturm–Liouville eigenvalue problem

\[ -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad 0 < x < 1, \quad u(0) = 0, \quad u(1) = 0, \tag{5.45} \]

determine the function $q$. We saw in Example 5.1 that the knowledge of the
spectrum $\{\lambda_n : n \in \mathbb{N}\}$ is, in general, not sufficient to determine $q$ uniquely. We
need more information, such as a second spectrum $\mu_n$ of an eigenvalue problem
of the form

\[ -v''(x) + q(x)\, v(x) = \mu\, v(x), \quad v(0) = 0, \quad v'(1) + H v(1) = 0, \tag{5.46} \]

or some knowledge about the eigenfunctions.

The basic tool in the uniqueness proof for this inverse problem is the use
of the Gelfand–Levitan–Marchenko integral operator (see [101]). This integral
operator maps solutions of initial value problems for the equation $-u'' + qu = \lambda u$
onto solutions for the equation $-u'' + pu = \lambda u$ and, most importantly, does not
depend on $\lambda$. It turns out that the kernel of this operator is the solution for the
hyperbolic boundary value problem that was studied in the previous section.


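Theorem 5.19 Let $p,q \in L^2(0,1)$, $\lambda \in \mathbb{C}$, and $u,v \in H^2(0,1)$ be solutions of

\[ -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad 0 < x < 1, \quad u(0) = 0, \tag{5.47a} \]
\[ -v''(x) + p(x)\, v(x) = \lambda\, v(x), \quad 0 < x < 1, \quad v(0) = 0, \tag{5.47b} \]

such that $u'(0) = v'(0)$. Also let $K \in C(\Delta_0)$ be the weak solution of the Goursat
problem

\[ \frac{\partial^2 K(x,t)}{\partial x^2} - \frac{\partial^2 K(x,t)}{\partial t^2} + \bigl(p(t) - q(x)\bigr)\, K(x,t) = 0 \quad \text{in } \Delta_0, \tag{5.48a} \]
\[ K(x,0) = 0, \quad 0 \le x \le 1, \tag{5.48b} \]
\[ K(x,x) = \frac12 \int_0^x \bigl(q(s) - p(s)\bigr)\, ds, \quad 0 \le x \le 1, \tag{5.48c} \]

where the triangular region $\Delta_0$ is again defined by

\[ \Delta_0 := \{ (x,t) \in \mathbb{R}^2 : 0 < t < x < 1 \}. \tag{5.49} \]

Then we have

\[ u(x) = v(x) + \int_0^x K(x,t)\, v(t)\, dt, \quad 0 \le x \le 1. \tag{5.50} \]

We remark that Theorem 5.15 with $f(x) = \frac12 \int_0^x \bigl(q(s) - p(s)\bigr)\, ds$ implies that
this Goursat problem is uniquely solvable in the weak sense.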


Proof: First, let $p,q \in C[0,1]$. Then $K \in C^2(\Delta_0)$ by Theorem 5.15. Define $w$
by the right-hand side of (5.50); that is,

\[ w(x) := v(x) + \int_0^x K(x,t)\, v(t)\, dt \quad \text{for } 0 \le x \le 1. \]

Then $w(0) = v(0) = 0 = u(0)$ and $w$ is differentiable with

\[ w'(x) = v'(x) + K(x,x)\, v(x) + \int_0^x K_x(x,t)\, v(t)\, dt, \quad 0 < x < 1. \]
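Again, we denote by $K_x$, $K_t$, etc., the partial derivatives. For $x = 0$, we have
$w'(0) = v'(0) = u'(0)$. Furthermore,

\[
\begin{aligned}
w''(x) &= v''(x) + v(x)\, \frac{d}{dx} K(x,x) + K(x,x)\, v'(x) + K_x(x,x)\, v(x) + \int_0^x K_{xx}(x,t)\, v(t)\, dt \\
&= \Bigl[ p(x) - \lambda + \frac{d}{dx} K(x,x) + K_x(x,x) \Bigr] v(x) + K(x,x)\, v'(x) \\
&\quad + \int_0^x \bigl[ (q(x) - p(t))\, K(x,t)\, v(t) + K_{tt}(x,t)\, v(t) \bigr]\, dt.
\end{aligned}
\]

Partial integration yields

\[
\begin{aligned}
\int_0^x K_{tt}(x,t)\, v(t)\, dt &= \int_0^x K(x,t)\, v''(t)\, dt + \bigl[ K_t(x,t)\, v(t) - K(x,t)\, v'(t) \bigr]_{t=0}^{t=x} \\
&= \int_0^x (p(t) - \lambda)\, K(x,t)\, v(t)\, dt + K_t(x,x)\, v(x) - K(x,x)\, v'(x).
\end{aligned}
\]

Therefore, we have

\[
\begin{aligned}
w''(x) &= \Bigl[ p(x) - \lambda + \underbrace{\frac{d}{dx} K(x,x) + K_x(x,x) + K_t(x,x)}_{=\, 2 \frac{d}{dx} K(x,x) \,=\, q(x) - p(x)} \Bigr] v(x) + (q(x) - \lambda) \int_0^x K(x,t)\, v(t)\, dt \\
&= (q(x) - \lambda) \Bigl[ v(x) + \int_0^x K(x,t)\, v(t)\, dt \Bigr] = (q(x) - \lambda)\, w(x);
\end{aligned}
\]

that is, $w$ solves the same initial value problem as $u$. The Picard–Lindelöf
uniqueness theorem for initial value problems yields $w = u$. Thus, we
have proven the theorem for smooth functions $p$ and $q$.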

Now let $p,q \in L^2(0,1)$. Then we choose functions $(p_n)$, $(q_n)$ in $C[0,1]$ with
$p_n \to p$ and $q_n \to q$ in $L^2(0,1)$, respectively. Let $K_n$ be the solution of (5.48a)–(5.48c)
for $p_n$ and $q_n$. We have already shown that

\[ u_n(x) = v_n(x) + \int_0^x K_n(x,t)\, v_n(t)\, dt, \quad 0 \le x \le 1, \]

for all $n \in \mathbb{N}$, where $u_n$ and $v_n$ solve (5.47a) and (5.47b), respectively, with
$u_n'(0) = v_n'(0) = u'(0) = v'(0)$. From the continuous dependence results of
Theorems 5.6 and 5.15(b), the functions $u_n$, $v_n$, and $K_n$ converge uniformly
to $u$, $v$, and $K$, respectively. This proves the assertion of the theorem for
$p,q \in L^2(0,1)$.


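As an example, we take $p = 0$ and $v(x) = \sin(\sqrt{\lambda}\, x)/\sqrt{\lambda}$ and have the
following result:


Example 5.20
Let $u_\lambda$ be a solution of

\[ -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad u(0) = 0, \quad u'(0) = 1, \tag{5.51} \]

for given $q \in L^2(0,1)$. Then we have the representation

\[ u_\lambda(x) = \frac{\sin(\sqrt{\lambda}\, x)}{\sqrt{\lambda}} + \int_0^x K(x,t)\, \frac{\sin(\sqrt{\lambda}\, t)}{\sqrt{\lambda}}\, dt, \quad 0 \le x \le 1, \tag{5.52} \]

where the kernel $K$ solves the following Goursat problem in the weak sense:

\[ K_{xx}(x,t) - K_{tt}(x,t) - q(x)\, K(x,t) = 0 \quad \text{in } \Delta_0, \tag{5.53a} \]
\[ K(x,0) = 0, \quad 0 \le x \le 1, \tag{5.53b} \]
\[ K(x,x) = \frac12 \int_0^x q(s)\, ds, \quad 0 \le x \le 1. \tag{5.53c} \]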
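For a constant potential the representation (5.52) can be tested explicitly. The following sketch is an illustration that is not taken from the text: it assumes $q \equiv c$, for which $K(x,t) = c\, t\, I_1(z)/z$ with $z = \sqrt{c(x^2 - t^2)}$ (the modified Bessel function $I_1$ is evaluated by its power series) satisfies (5.53a)–(5.53c), and for which $u_\lambda(x) = \sin(\sqrt{\lambda - c}\, x)/\sqrt{\lambda - c}$ when $\lambda > c$. All function names are hypothetical.

```python
import math

def i1_over_z(z_squared, terms=30):
    """I_1(z)/z = (1/2) * sum_k (z^2/4)^k / (k! (k+1)!), as a series in z^2."""
    quarter = 0.25 * z_squared
    term, total = 0.5, 0.5               # k = 0 term
    for k in range(1, terms):
        term *= quarter / (k * (k + 1))
        total += term
    return total

def kernel(x, t, c):
    """Assumed kernel K(x,t) for the constant potential q = c."""
    return c * t * i1_over_z(c * (x * x - t * t))

def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def u_via_kernel(x, lam, c):
    """Right-hand side of (5.52) for q = c."""
    root = math.sqrt(lam)
    integrand = lambda t: kernel(x, t, c) * math.sin(root * t)
    return (math.sin(root * x) + simpson(integrand, 0.0, x)) / root

def u_exact(x, lam, c):
    """Solution of -u'' + c u = lam u, u(0) = 0, u'(0) = 1, for lam > c."""
    root = math.sqrt(lam - c)
    return math.sin(root * x) / root

print(u_via_kernel(1.0, 10.0, 2.0), u_exact(1.0, 10.0, 2.0))
```

Note that the kernel does not depend on $\lambda$: the same `kernel` reproduces $u_\lambda$ for every admissible value of `lam`.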


This example has an application that is interesting in itself but that we also
need in Section 5.7 and later in Subsection 7.6.3.


Theorem 5.21 Let $\lambda_n$ be the eigenvalues of one of the eigenvalue problems
(5.45) or (5.46) where again $q \in L^2(0,1)$. Then the set of functions $\{\sin(\sqrt{\lambda_n}\, \cdot) : n \in \mathbb{N}\}$
is complete in $L^2(0,1)$. This means that $\int_0^1 h(x) \sin(\sqrt{\lambda_n}\, x)\, dx = 0$ for
all $n \in \mathbb{N}$ implies that $h = 0$.


Proof: Let $T : L^2(0,1) \to L^2(0,1)$ be the Volterra integral operator of the
second kind with kernel $K$; that is,

\[ (Tv)(x) := v(x) + \int_0^x K(x,t)\, v(t)\, dt, \quad x \in (0,1), \quad v \in L^2(0,1), \]

where $K$ solves the Goursat problem (5.53a)–(5.53c) in the weak sense. Then
we know that $T$ is an isomorphism from $L^2(0,1)$ onto itself. Define $v_n(x) := \sin(\sqrt{\lambda_n}\, x)$
for $x \in [0,1]$, $n \in \mathbb{N}$. Let $u_n$ be the eigenfunction corresponding to
$\lambda_n$, normalized to $u_n'(0) = 1$. By the preceding example,

\[ u_n = \frac{1}{\sqrt{\lambda_n}}\, T v_n \quad\text{or}\quad v_n = \sqrt{\lambda_n}\, T^{-1} u_n. \]

Now, if $\int_0^1 h(x)\, v_n(x)\, dx = 0$ for all $n \in \mathbb{N}$, then

\[ 0 = \int_0^1 h(x)\, (T^{-1} u_n)(x)\, dx = \int_0^1 u_n(x)\, \bigl((T^*)^{-1} h\bigr)(x)\, dx \quad \text{for all } n \in \mathbb{N}, \]

where $T^*$ denotes the $L^2$-adjoint of $T$. Because $\{u_n/\|u_n\|_{L^2} : n \in \mathbb{N}\}$ is complete
in $L^2(0,1)$ by Lemma 5.7, we conclude that $(T^*)^{-1} h = 0$ and thus $h = 0$.


Now we can prove the main uniqueness theorem. 


Theorem 5.22 Let $H \in \mathbb{R}$, $p,q \in L^2(0,1)$, and $\lambda_n(p)$, $\lambda_n(q)$ be the eigenvalues
of the eigenvalue problem

\[ -u'' + r u = \lambda u \quad \text{in } (0,1), \quad u(0) = 0, \quad u(1) = 0, \]

corresponding to $r = p$ and $r = q$, respectively. Furthermore, let $\mu_n(p)$ and
$\mu_n(q)$ be the eigenvalues of

\[ -u'' + r u = \mu u \quad \text{in } (0,1), \quad u(0) = 0, \quad u'(1) + H u(1) = 0, \]

corresponding to $r = p$ and $r = q$, respectively.
If $\lambda_n(p) = \lambda_n(q)$ and $\mu_n(p) = \mu_n(q)$ for all $n \in \mathbb{N}$, then $p = q$.


Proof: From the asymptotics of the eigenvalues (Theorem 5.11), we conclude
that

\[ \lambda_n(p) = n^2\pi^2 + \int_0^1 p(t)\, dt + o(1), \quad n \to \infty, \]

and thus

\[ \int_0^1 \bigl(p(t) - q(t)\bigr)\, dt = \lim_{n\to\infty} \bigl(\lambda_n(p) - \lambda_n(q)\bigr) = 0. \tag{5.54} \]

Now let $K$ be the weak solution of the Goursat problem (5.48a)–(5.48c).
$K$ depends only on $p$ and $q$ and is independent of the eigenvalues $\lambda_n := \lambda_n(p) = \lambda_n(q)$
and $\mu_n := \mu_n(p) = \mu_n(q)$. Furthermore, from (5.54), we conclude that
$K(1,1) = 0$.
Now let $u_n$, $v_n$ be the eigenfunctions corresponding to $\lambda_n(q)$ and $\lambda_n(p)$,
respectively; that is, solutions of the differential equations

\[ -u_n''(x) + q(x)\, u_n(x) = \lambda_n u_n(x), \qquad -v_n''(x) + p(x)\, v_n(x) = \lambda_n v_n(x) \]

for $0 < x < 1$ with homogeneous Dirichlet boundary conditions on both sides.
Furthermore, we assume that they are normalized by $u_n'(0) = v_n'(0) = 1$. Then
Theorem 5.19 is applicable and yields the relationship

\[ u_n(x) = v_n(x) + \int_0^x K(x,t)\, v_n(t)\, dt \quad \text{for all } x \in [0,1] \tag{5.55} \]

and all $n \in \mathbb{N}$. For $x = 1$, the boundary conditions yield

\[ 0 = \int_0^1 K(1,t)\, v_n(t)\, dt \quad \text{for all } n \in \mathbb{N}. \tag{5.56} \]

Now we use the fact that the set $\{v_n/\|v_n\|_{L^2} : n \in \mathbb{N}\}$ forms a complete orthonormal
system in $L^2(0,1)$. From this, $K(1,t) = 0$ for all $t \in [0,1]$ follows.

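Now let $\tilde u_n$ and $\tilde v_n$ be eigenfunctions corresponding to $\mu_n$ and $q$ and $p$,
respectively, with the normalization $\tilde u_n'(0) = \tilde v_n'(0) = 1$. Again, Theorem 5.19
is applicable and yields the relationship (5.55) for $\tilde u_n$ and $\tilde v_n$ instead of $u_n$ and
$v_n$, respectively. Assume for the moment that $K$ is differentiable. Then we can
differentiate this equation, set $x = 1$, and arrive at

\[
\begin{aligned}
0 &= \tilde u_n'(1) - \tilde v_n'(1) + H \bigl[ \tilde u_n(1) - \tilde v_n(1) \bigr] \\
&= K(1,1)\, \tilde v_n(1) + \int_0^1 \bigl[ K_x(1,t) + H K(1,t) \bigr]\, \tilde v_n(t)\, dt.
\end{aligned}
\]

Because $K(1,1) = 0$ and $K(1,\cdot) = 0$, we conclude that $\int_0^1 K_x(1,t)\, \tilde v_n(t)\, dt = 0$ for all $n \in \mathbb{N}$. From this, $K_x(1,t) = 0$
for all $t \in (0,1)$ follows because $\{\tilde v_n/\|\tilde v_n\|_{L^2}\}$ forms a complete orthonormal
system. However, $K$ is only a weak solution since $p,q \in L^2(0,1)$. From part (b)
of Remark 5.16 we know that $K$ has the form $K(x,t) = K_1(x,t) - f\bigl(\tfrac12(x-t)\bigr) + f\bigl(\tfrac12(x+t)\bigr)$
with $K_1 \in C^1(\Delta_0)$ where in the present situation
$f(x) = \frac12 \int_0^x \bigl(q(s) - p(s)\bigr)\, ds$ is in $H^1_{00}(0,1)$. Then one can easily prove (approximation
of $f$ by smooth functions) that $\psi \in H^1(0,1)$ where $\psi(x) = \int_0^x f\bigl(\tfrac12(x \pm t)\bigr)\, \tilde v_n(t)\, dt$
and $\psi'(x) = f\bigl(\tfrac12(x \pm x)\bigr)\, \tilde v_n(x) + \frac12 \int_0^x f'\bigl(\tfrac12(x \pm t)\bigr)\, \tilde v_n(t)\, dt$. Therefore, one can
argue as in the smooth case of $K$.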

Now we apply the uniqueness part of Theorem 5.17 (in particular, the integral
equation (5.42) for $f = g = 0$ and $F = 0$) which yields that $K$ has to vanish
identically. In particular, this means that

\[ 0 = K(x,x) = \frac12 \int_0^x \bigl(q(s) - p(s)\bigr)\, ds \quad \text{for all } x \in (0,1). \]

Differentiating this equation yields that $p = q$.


We have seen in Example 5.1 that the knowledge of one spectrum for the 
Sturm-Liouville differential equation is not enough information to recover the 
function q uniquely. Instead of knowing the spectrum for a second pair of 
boundary conditions, we can use other kinds of information, as the following 
theorem shows: 


Theorem 5.23 Let $p,q \in L^2(0,1)$ with eigenvalues $\lambda_n(p)$, $\lambda_n(q)$, and eigenfunctions
$u_n$ and $v_n$, respectively, corresponding to Dirichlet boundary conditions
$u(0) = 0$, $u(1) = 0$. Let the eigenvalues coincide; that is, $\lambda_n(p) = \lambda_n(q)$
for all $n \in \mathbb{N}$. Let one of the following assumptions also be satisfied:


(a) Let $p$ and $q$ be even functions with respect to $1/2$; that is, $p(1-x) = p(x)$
and $q(1-x) = q(x)$ for all $x \in [0,1]$.


(b) Let the Neumann boundary values coincide; that is, let

\[ \frac{u_n'(1)}{u_n'(0)} = \frac{v_n'(1)}{v_n'(0)} \quad \text{for all } n \in \mathbb{N}. \tag{5.57} \]

Then $p = q$.


Proof: (a) From Theorem 5.7, part (e), we know that the eigenfunctions $u_n$ and
$v_n$, again normalized by $u_n'(0) = v_n'(0) = 1$, are even with respect to $x = 1/2$
for odd $n$ and odd for even $n$. In particular, $u_n'(1) = v_n'(1)$. This reduces the
uniqueness question for part (a) to part (b).

(b) We follow the first part of the proof of Theorem 5.22. From (5.56), we
again conclude that $K(1,t)$ vanishes for all $t \in (0,1)$. The additional assumption
(5.57) yields that $u_n'(1) = v_n'(1)$. We differentiate (5.55), set $x = 1$,
and arrive at $\int_0^1 K_x(1,t)\, v_n(t)\, dt = 0$ for all $n \in \mathbb{N}$. Again, this implies that
$K_x(1,\cdot) = 0$, and the proof follows the same lines as the proof of Theorem 5.22.


5.6 A Parameter Identification Problem 


This section and the next two chapters are devoted to the important field of 
parameter identification problems for partial differential equations. In Chap- 
ter 6, we study the problem of impedance tomography to determine the conduc- 
tivity distribution from boundary measurements, while in Chapter 7, we study 
the inverse scattering problem to determine the refractive index of a medium 
from measurements of the scattered field. In the present section, we consider 
an application of the inverse Sturm—Liouville eigenvalue problem to the follow- 
ing parabolic initial boundary value problem. First, we formulate the direct 
problem: 

Let $T > 0$ and $\Omega_T := (0,1) \times (0,T) \subset \mathbb{R}^2$, $q \in C[0,1]$ and $f \in C^2[0,T]$
be given with $f(0) = 0$ and $q(x) \ge 0$ for $x \in [0,1]$. Determine $U \in C(\overline{\Omega_T})$,
which is twice continuously differentiable with respect to $x$ and continuously
differentiable with respect to $t$ in $\Omega_T$ such that $\partial U/\partial x \in C(\overline{\Omega_T})$ and

\[ \frac{\partial U(x,t)}{\partial t} = \frac{\partial^2 U(x,t)}{\partial x^2} - q(x)\, U(x,t) \quad \text{in } \Omega_T, \tag{5.58a} \]
\[ U(x,0) = 0, \quad x \in [0,1], \tag{5.58b} \]
\[ U(0,t) = 0, \quad \frac{\partial}{\partial x} U(1,t) = f(t), \quad t \in (0,T). \tag{5.58c} \]

From the theory of parabolic initial boundary value problems, it is known that
there exists a unique solution of this problem. We prove uniqueness and refer
to [170] or (5.60) for the question of existence.


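Theorem 5.24 Let $f = 0$. Then $U = 0$ is the only solution of (5.58a)–(5.58c)
in $\Omega_T$.


Proof: Multiply the differential equation (5.58a) by $U(x,t)$ and integrate with
respect to $x$. This yields

\[ \frac12 \frac{d}{dt} \int_0^1 U(x,t)^2\, dx = \int_0^1 \Bigl[ \frac{\partial^2 U(x,t)}{\partial x^2}\, U(x,t) - q(x)\, U(x,t)^2 \Bigr]\, dx. \]

We integrate by parts and use the homogeneous boundary conditions:

\[ \frac12 \frac{d}{dt} \int_0^1 U(x,t)^2\, dx = -\int_0^1 \Bigl[ \Bigl( \frac{\partial U(x,t)}{\partial x} \Bigr)^2 + q(x)\, U(x,t)^2 \Bigr]\, dx \le 0. \]

This implies that $t \mapsto \int_0^1 U(x,t)^2\, dx$ is nonnegative and monotonically nonincreasing.
From $\int_0^1 U(x,0)^2\, dx = 0$, we conclude that $\int_0^1 U(x,t)^2\, dx = 0$ for all
$t$; that is, $U = 0$.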
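The energy argument has a simple discrete analogue. The sketch below is an illustration only and not the method of the text: it uses an explicit finite-difference scheme for $U_t = U_{xx} - qU$ with $U(0,t) = 0$ and $U_x(1,t) = 0$, started from a hypothetical nonzero initial profile (in the theorem the initial data vanish), and records the trapezoid-weighted discrete energy $\int_0^1 U^2\,dx$, which is expected to decrease in time when $q \ge 0$ and the step sizes satisfy the usual stability restriction.

```python
import math

def energy_history(q, u0, n=20, dt=1e-3, steps=200):
    """Explicit scheme for U_t = U_xx - q U, U(0,t) = 0, U_x(1,t) = 0.

    Returns the list of discrete energies (trapezoid weights) after each step.
    Assumes dt / dx**2 <= 1/2 and q >= 0.
    """
    dx = 1.0 / n
    r = dt / dx**2
    x = [i * dx for i in range(n + 1)]
    u = [u0(xi) for xi in x]
    u[0] = 0.0

    def energy(v):
        return dx * (sum(vi * vi for vi in v[1:n]) + 0.5 * v[n] ** 2)

    energies = [energy(u)]
    for _ in range(steps):
        new = [0.0] * (n + 1)
        for i in range(1, n):
            new[i] = (u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
                      - dt * q(x[i]) * u[i])
        # Ghost point u[n+1] = u[n-1] enforces the condition U_x(1,t) = 0.
        new[n] = u[n] + r * (2.0 * u[n - 1] - 2.0 * u[n]) - dt * q(x[n]) * u[n]
        u = new
        energies.append(energy(u))
    return energies

# Hypothetical data: q(x) = 1 + x >= 0 and an initial profile with u0(0) = 0.
E = energy_history(lambda x: 1.0 + x, lambda x: math.sin(0.5 * math.pi * x))
print(E[0], E[-1])
```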


Now we turn to the inverse problem. Let $f$ be known and, in addition,
$U(1,t)$ for all $0 < t < T$. The inverse problem is to determine the coefficient $q$.
In this section, we are only interested in the question of whether this provides sufficient
information in principle to recover $q$ uniquely; that is, we study the question of
uniqueness of the inverse problem. It is our aim to prove the following theorem:


Theorem 5.25 Let $U_1$, $U_2$ be solutions of (5.58a)–(5.58c) corresponding to
$q = q_1 \ge 0$ and $q = q_2 \ge 0$, respectively, and to the same $f \in C^2[0,T]$ with $f(0) = 0$
and $f'(0) \ne 0$. Let $U_1(1,t) = U_2(1,t)$ for all $t \in (0,T)$. Then $q_1 = q_2$ on $[0,1]$.

Proof: Let $(q,U)$ be $(q_1,U_1)$ or $(q_2,U_2)$, respectively. Let $\lambda_n$ and $g_n$, $n \in \mathbb{N}$,
be the eigenvalues and eigenfunctions, respectively, of the Sturm–Liouville
eigenvalue problem (5.46) for $H = 0$; that is,

\[ -u''(x) + q(x)\, u(x) = \lambda\, u(x), \quad 0 < x < 1, \quad u(0) = u'(1) = 0. \]

We assume that the eigenfunctions are normalized by $\|g_n\|_{L^2} = 1$ for all $n \in \mathbb{N}$.
Furthermore, we can assume that $g_n(1) > 0$ for all $n \in \mathbb{N}$.¹ We know that
$\{g_n : n \in \mathbb{N}\}$ forms a complete orthonormal system in $L^2(0,1)$. Theorem 5.14
implies the asymptotic behavior

\[ \lambda_n = (n+1/2)^2 \pi^2 + \hat q + \tilde\lambda_n \quad\text{with}\quad \sum_{n=1}^\infty \tilde\lambda_n^2 < \infty, \tag{5.59a} \]
\[ g_n(x) = \sqrt{2}\, \sin\bigl((n+1/2)\pi x\bigr) + O(1/n), \tag{5.59b} \]

where $\hat q = \int_0^1 q(x)\, dx$. In the first step, we derive a series expansion for the
solution $U$ of the initial boundary value problem (5.58a)–(5.58c). From the
completeness of $\{g_n : n \in \mathbb{N}\}$, we have the Fourier expansion

\[ U(x,t) = \sum_{n=1}^\infty a_n(t)\, g_n(x) \quad\text{with}\quad a_n(t) = \int_0^1 U(x,t)\, g_n(x)\, dx, \quad n \in \mathbb{N}, \]


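where the convergence is understood in the $L^2(0,1)$-sense for every $t \in (0,T)$.
We would like to substitute this into the differential equation and the initial and
boundary conditions. Because for this formal procedure the interchanging of
summation and differentiation is not justified, we suggest a different derivation
of $a_n$. We differentiate $a_n$ and use the partial differential equation (5.58a). This
yields

\[
\begin{aligned}
a_n'(t) &= \int_0^1 \frac{\partial U(x,t)}{\partial t}\, g_n(x)\, dx = \int_0^1 \Bigl[ \frac{\partial^2 U(x,t)}{\partial x^2} - q(x)\, U(x,t) \Bigr] g_n(x)\, dx \\
&= \Bigl[ g_n(x)\, \frac{\partial U(x,t)}{\partial x} - U(x,t)\, g_n'(x) \Bigr]_{x=0}^{x=1} + \int_0^1 U(x,t)\, \underbrace{\bigl[ g_n''(x) - q(x)\, g_n(x) \bigr]}_{=\, -\lambda_n g_n(x)}\, dx \\
&= f(t)\, g_n(1) - \lambda_n\, a_n(t).
\end{aligned}
\]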
With the initial condition $a_n(0) = 0$, the solution is given by

\[ a_n(t) = g_n(1) \int_0^t f(\tau)\, e^{-\lambda_n (t-\tau)}\, d\tau; \]

that is, the solution $U$ of (5.58a)–(5.58c) takes the form

\[ U(x,t) = \sum_{n=1}^\infty g_n(1)\, g_n(x) \int_0^t f(\tau)\, e^{-\lambda_n (t-\tau)}\, d\tau. \tag{5.60} \]

¹ $g_n(1) = 0$ is impossible because of $g_n'(1) = 0$.
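The coefficient formula and the series (5.60) can be made concrete in the simplest case. The sketch below rests on assumptions that are not part of the text: $q = 0$, so that $\lambda_n = (n+1/2)^2\pi^2$ and $g_n(1) = \sqrt{2}$ with the index running over $n = 0, 1, 2, \ldots$, and the hypothetical boundary function $f(t) = t$, for which the time integral has the closed form $t/\lambda_n - (1 - e^{-\lambda_n t})/\lambda_n^2$. For large $t$ the partial sums of $U(1,t)$ should approach $t - 1/3$, the value of the quasi-stationary profile $tx + x^3/6 - x/2$ at $x = 1$.

```python
import math

def eigenpair(n):
    """Eigenvalue and boundary value g_n(1) for q = 0 (index n = 0, 1, ...)."""
    lam = ((n + 0.5) * math.pi) ** 2
    return lam, math.sqrt(2.0)          # sign chosen such that g_n(1) > 0

def coefficient(n, t):
    """a_n(t) = g_n(1) * int_0^t tau * exp(-lam_n (t - tau)) d tau for f(t) = t."""
    lam, g1 = eigenpair(n)
    return g1 * (t / lam - (1.0 - math.exp(-lam * t)) / lam**2)

def boundary_value(t, terms=2000):
    """Partial sum of U(1,t) = sum_n g_n(1) * a_n(t)."""
    return sum(eigenpair(n)[1] * coefficient(n, t) for n in range(terms))

# Check the ordinary differential equation a_n' = f(t) g_n(1) - lam_n a_n.
lam0, g1 = eigenpair(0)
t, h = 0.7, 1e-5
derivative = (coefficient(0, t + h) - coefficient(0, t - h)) / (2.0 * h)
print(derivative, t * g1 - lam0 * coefficient(0, t))
print(boundary_value(5.0), 5.0 - 1.0 / 3.0)
```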
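From partial integration, we observe that

\[ \int_0^t f(\tau)\, e^{-\lambda_n (t-\tau)}\, d\tau = \frac{1}{\lambda_n}\, f(t) - \frac{1}{\lambda_n} \int_0^t f'(\tau)\, e^{-\lambda_n (t-\tau)}\, d\tau, \]

and this decays as $1/\lambda_n$. Using this and the asymptotic behavior (5.59a) and
(5.59b), we conclude that the series (5.60) converges uniformly in $\Omega_T$. For
$x = 1$, the representation (5.60) reduces to

\[
\begin{aligned}
U(1,t) &= \sum_{n=1}^\infty g_n(1)^2 \int_0^t f(\tau)\, e^{-\lambda_n (t-\tau)}\, d\tau \\
&= \int_0^t f(\tau)\, \underbrace{\sum_{n=1}^\infty g_n(1)^2\, e^{-\lambda_n (t-\tau)}}_{=:\, A(t-\tau)}\, d\tau, \quad t \in [0,T].
\end{aligned}
\]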


Changing the orders of integration and summation is justified by Lebesgue's
theorem of dominated convergence. This is seen from the estimate

\[ \sum_{n=1}^\infty g_n(1)^2\, e^{-\lambda_n s} \le c \sum_{n=1}^\infty e^{-n^2\pi^2 s} \le c \int_0^\infty e^{-\sigma^2\pi^2 s}\, d\sigma = \frac{c}{2\sqrt{\pi s}} \]

and the fact that the function $s \mapsto 1/\sqrt{s}$ is integrable in $(0,T]$.

Such a representation holds for $U_1(1,\cdot)$ and $U_2(1,\cdot)$ corresponding to $q_1$ and
$q_2$, respectively. We denote the dependence on $q_1$ and $q_2$ by superscripts (1)
and (2), respectively. From $U_1(1,\cdot) = U_2(1,\cdot)$, we conclude that

\[ 0 = \int_0^t f(\tau) \bigl[ A^{(1)}(t-\tau) - A^{(2)}(t-\tau) \bigr]\, d\tau = \int_0^t f(t-\tau) \bigl[ A^{(1)}(\tau) - A^{(2)}(\tau) \bigr]\, d\tau; \]

that is, the function $w := A^{(1)} - A^{(2)}$ solves the homogeneous Volterra integral
equation of the first kind with kernel $f(t-\tau)$. We differentiate this equation
twice and use $f(0) = 0$ and $f'(0) \ne 0$. This yields a Volterra equation of the
second kind for $w$:

\[ f'(0)\, w(t) + \int_0^t f''(t-s)\, w(s)\, ds = 0, \quad t \in (0,T). \]

Because Volterra equations of the second kind are uniquely solvable (see Example A.32
of Appendix A.3), this yields $w(t) = 0$ for all $t$, that is

\[ \sum_{n=1}^\infty \bigl[ g_n^{(1)}(1) \bigr]^2 e^{-\lambda_n^{(1)} t} = \sum_{n=1}^\infty \bigl[ g_n^{(2)}(1) \bigr]^2 e^{-\lambda_n^{(2)} t} \quad \text{for all } t \in (0,T). \]

We note that $g_n^{(j)}(1) > 0$ for $j = 1,2$ by our normalization. Now we can apply
a result from the theory of Dirichlet series (see Lemma 5.26) and conclude that
$\lambda_n^{(1)} = \lambda_n^{(2)}$ and $g_n^{(1)}(1) = g_n^{(2)}(1)$ for all $n \in \mathbb{N}$. Applying the uniqueness result
analogous to Theorem 5.23, part (b), for the boundary conditions $u(0) = 0$ and
$u'(1) = 0$ (see Problem 5.5), we conclude that $q_1 = q_2$.


It remains to prove the following lemma: 


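Lemma 5.26 Let $\lambda_n$ and $\mu_n$ be strictly increasing sequences that tend to infinity.
Let the series

\[ \sum_{n=1}^\infty \alpha_n e^{-\lambda_n t} \quad\text{and}\quad \sum_{n=1}^\infty \beta_n e^{-\mu_n t} \]

converge for every $t \in (0,T]$ and uniformly on some interval $[\delta,T]$. Let the
limits coincide, that is

\[ \sum_{n=1}^\infty \alpha_n e^{-\lambda_n t} = \sum_{n=1}^\infty \beta_n e^{-\mu_n t} \quad \text{for all } t \in (0,T). \]

If we also assume that $\alpha_n \ne 0$ and $\beta_n \ne 0$ for all $n \in \mathbb{N}$, then $\alpha_n = \beta_n$ and
$\lambda_n = \mu_n$ for all $n \in \mathbb{N}$.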


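Proof: Assume that $\lambda_1 \ne \mu_1$ or $\alpha_1 \ne \beta_1$. Without loss of generality, we can
assume that $\mu_1 \ge \lambda_1$ (otherwise, interchange the roles of $\lambda_n$ and $\mu_n$). Define

\[ C_n(t) := \alpha_n e^{-(\lambda_n - \lambda_1) t} - \beta_n e^{-(\mu_n - \lambda_1) t} \quad \text{for } t \ge \delta. \]

By analytic continuation, we conclude that $\sum_{n=1}^\infty C_n(t) = 0$ for all $t \ge \delta$ and
that the series converges uniformly on $[\delta,\infty)$. Because

\[ C_1(t) = \alpha_1 - \beta_1 e^{-(\mu_1 - \lambda_1) t} \]

and $\alpha_1 \ne \beta_1$ or $\mu_1 > \lambda_1$, there exist $\varepsilon > 0$ and $t_1 > \delta$ such that $|C_1(t)| \ge \varepsilon$ for
all $t \ge t_1$. Choose $n_0 \in \mathbb{N}$ with

\[ \Bigl| \sum_{n=n_0+1}^\infty C_n(t) \Bigr| < \frac{\varepsilon}{2} \quad \text{for all } t \ge t_1. \]

Then we conclude that

\[ \Bigl| \sum_{n=2}^{n_0} C_n(t) \Bigr| = \Bigl| C_1(t) + \sum_{n=n_0+1}^\infty C_n(t) \Bigr| \ge |C_1(t)| - \Bigl| \sum_{n=n_0+1}^\infty C_n(t) \Bigr| \ge \frac{\varepsilon}{2} \]

for all $t \ge t_1$. Now we let $t$ tend to infinity. The first finite sum converges to
zero, which is a contradiction. Therefore, we have shown that $\lambda_1 = \mu_1$ and
$\alpha_1 = \beta_1$. Now we repeat the argument for $n = 2$, etc. This proves the lemma.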
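The proof rests on the fact that the first term dominates a Dirichlet series for large $t$. This can be illustrated numerically. The sketch below is an illustration, not a reconstruction algorithm from the text: the coefficients and exponents are hypothetical, and the estimate assumes that $S(t)$ and $S(t + \Delta t)$ have the same sign.

```python
import math

def dirichlet_series(alphas, lambdas):
    """Return S(t) = sum_n alpha_n * exp(-lambda_n * t) as a function of t."""
    def S(t):
        return sum(a * math.exp(-l * t) for a, l in zip(alphas, lambdas))
    return S

def leading_term(S, t, dt=1.0):
    """Estimate (lambda_1, alpha_1) from the behaviour of S at large t."""
    lam1 = math.log(S(t) / S(t + dt)) / dt     # decay rate of the dominant term
    alpha1 = S(t) * math.exp(lam1 * t)         # amplitude of the dominant term
    return lam1, alpha1

# Hypothetical data: alpha = (2, 3), lambda = (1, 4).
S = dirichlet_series([2.0, 3.0], [1.0, 4.0])
print(leading_term(S, 10.0))
```

Subtracting the recovered first term and repeating the estimate corresponds to the step "now we repeat the argument for $n = 2$" in the proof, although this is numerically delicate because of the rapid decay of the remaining terms.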
5.7 Numerical Reconstruction Techniques


In this section, we discuss some numerical algorithms for solving the inverse
spectral problem which were suggested and tested by W. Rundell, P. Sacks, and
others. We follow closely the papers [186, 233, 234].

From now on, we assume knowledge of eigenvalues $\lambda_n$ and $\mu_n$, $n \in \mathbb{N}$, of the
Sturm–Liouville eigenvalue problems (5.45) or (5.46). It is our aim to determine
the unknown function $q$. Usually, only a finite number of eigenvalues is known.
Then one cannot expect to recover the total function $q$ but only "some portion"
of it (see (5.62)).

The first algorithm we discuss uses the concept of the characteristic function
again. For simplicity, we describe the method only for the case where $q$ is known
to be an even function; that is, $q(1-x) = q(x)$. Then we know that only one
spectrum suffices to recover $q$ (see Theorem 5.23).

Recalling the characteristic function $f(\lambda) = u_2(1,\lambda,q)$ for the problem
(5.45), the inverse problem can be written as the problem of solving the equations

\[ u_2(1,\lambda_n,q) = 0 \quad \text{for all } n \in \mathbb{N} \tag{5.61} \]

for the function $q$. If we know only a finite number, say $\lambda_n$ for $n = 1,\ldots,N$,
then we assume that $q$ is of the form

\[ q(x;a) = \sum_{n=1}^N a_n\, q_n(x), \quad x \in (0,1), \tag{5.62} \]

for coefficients $a = (a_1,\ldots,a_N) \in \mathbb{R}^N$ and some given linearly independent even
functions $q_n$. If $q$ is expected to be smooth and periodic, a good choice for $q_n$
is $q_n(x) = \cos(2\pi(n-1)x)$, $n = 1,\ldots,N$. Equation (5.61) then reduces to the
finite nonlinear system $F(a) = 0$, where $F : \mathbb{R}^N \to \mathbb{R}^N$ is defined by

\[ F_n(a) := u_2\bigl(1,\lambda_n,q(\cdot;a)\bigr) \quad \text{for } a \in \mathbb{R}^N \text{ and } n = 1,\ldots,N. \]

Therefore, all of the well-developed methods for solving systems of nonlinear
equations can be used. For example, Newton's method

\[ a^{(k+1)} = a^{(k)} - F'(a^{(k)})^{-1} F(a^{(k)}), \quad k = 0,1,\ldots, \]

is known to be quadratically convergent if $F'(a)$ is regular. As we know from
Section 5.2, Theorem 5.6, the mapping $F$ is continuously Fréchet differentiable
for every $a \in \mathbb{R}^N$. The computation of the derivative is rather expensive, and
in general, it is not known if $F'(a)$ is regular. In [186], it was proven that $F'(a)$
is regular for sufficiently small $a$ and is of triangular form for $a = 0$. This
observation leads to the simplified Newton method of the form

\[ a^{(k+1)} = a^{(k)} - F'(0)^{-1} F(a^{(k)}), \quad k = 0,1,\ldots. \]

For further aspects of this method, we refer to [186].
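To make the two iterations concrete, here is a minimal sketch for the smallest possible case $N = 1$ with the single basis function $q_1 = 1$, so that $q(x;a) = a$ is constant. Everything in it is an assumption made for illustration and not taken from [186]: $u_2(1,\lambda,q)$ is computed by a classical Runge–Kutta integration of $u'' = (q - \lambda)u$, $u(0) = 0$, $u'(0) = 1$, and the derivative of $F$ is replaced by a central difference quotient. For a constant potential the first Dirichlet eigenvalue is $\lambda_1 = \pi^2 + a$, which provides a test case.

```python
import math

def u2_at_one(lam, q, steps=400):
    """RK4 for u'' = (q(x) - lam) u, u(0) = 0, u'(0) = 1; returns u(1)."""
    h = 1.0 / steps
    x, u, v = 0.0, 0.0, 1.0

    def rhs(x, u, v):
        return v, (q(x) - lam) * u

    for _ in range(steps):
        k1u, k1v = rhs(x, u, v)
        k2u, k2v = rhs(x + h / 2, u + h / 2 * k1u, v + h / 2 * k1v)
        k3u, k3v = rhs(x + h / 2, u + h / 2 * k2u, v + h / 2 * k2v)
        k4u, k4v = rhs(x + h, u + h * k3u, v + h * k3v)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return u

def make_F(lam1):
    """F(a) = u_2(1, lam1, q(.; a)) for the one-term ansatz q(x; a) = a."""
    return lambda a: u2_at_one(lam1, lambda x: a)

def derivative(F, a, eps=1e-5):
    """Central difference quotient standing in for F'(a)."""
    return (F(a + eps) - F(a - eps)) / (2.0 * eps)

def newton(F, a0=0.0, iters=8):
    a = a0
    for _ in range(iters):
        a -= F(a) / derivative(F, a)
    return a

def simplified_newton(F, a0=0.0, iters=40):
    d0 = derivative(F, 0.0)          # derivative frozen at a = 0
    a = a0
    for _ in range(iters):
        a -= F(a) / d0
    return a

# Synthetic datum: first Dirichlet eigenvalue of the constant potential q = 1.5.
F = make_F(math.pi**2 + 1.5)
print(newton(F), simplified_newton(F))
```

The simplified iteration needs more steps but only one derivative evaluation, which reflects the trade-off described above.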
Before we describe a second algorithm, we observe that from the asymptotic
form (5.28) of the eigenvalues, we have an estimate of $\hat q = \int_0^1 q(x)\, dx$. Writing
the differential equation in the form

\[ -u''(x) + \bigl(q(x) - \hat q\bigr)\, u(x) = (\lambda - \hat q)\, u(x), \quad 0 \le x \le 1, \]

we observe that we can assume without loss of generality that $\int_0^1 q(x)\, dx = 0$.

Now we describe an algorithm that follows the idea of the uniqueness Theorem 5.22.
We allow $q \in L^2(0,1)$ to be arbitrary. The algorithm consists of two
steps. First, we recover the Cauchy data $f = K(1,\cdot)$ and $g = K_x(1,\cdot)$ from the
two sets of eigenvalues. Then we suggest Newton-type methods to compute $q$
from these Cauchy data.

The starting point is Theorem 5.19 for the case $p = 0$. We have already formulated
this special case in Example 5.20. Therefore, let $(\lambda_n, u_n)$ be the eigenvalues
and eigenfunctions of the eigenvalue problem (5.45) normalized such that
$u_n'(0) = 1$. The eigenvalues $\lambda_n$ are assumed to be known. From Example 5.20,
we have the representation

\[ u_n(x) = \frac{\sin(\sqrt{\lambda_n}\, x)}{\sqrt{\lambda_n}} + \int_0^x K(x,t)\, \frac{\sin(\sqrt{\lambda_n}\, t)}{\sqrt{\lambda_n}}\, dt, \quad 0 \le x \le 1, \tag{5.63} \]

where $K$ satisfies (5.53a)–(5.53c) with $K(1,1) = \frac12 \int_0^1 q(t)\, dt = 0$. From (5.63)
for $x = 1$, we can compute $K(1,t)$ because, by Theorem 5.21, the functions
$v_n(t) = \sin(\sqrt{\lambda_n}\, t)$ form a complete system in $L^2(0,1)$. When we know only a
finite number $\lambda_1,\ldots,\lambda_N$ of eigenvalues, we suggest representing $K(1,\cdot)$ as a
finite sum of the form

\[ K(1,t) = \sum_{k=1}^N a_k \sin(k\pi t), \]

arriving at the finite linear system

\[ \sum_{k=1}^N a_k \int_0^1 \sin(k\pi t)\, \sin(\sqrt{\lambda_n}\, t)\, dt = -\sin\sqrt{\lambda_n} \quad \text{for } n = 1,\ldots,N. \tag{5.64} \]


The same arguments yield a set of equations for the second boundary condition
$u'(1) + Hu(1) = 0$ in the form

\[ \sqrt{\mu_n} \cos\sqrt{\mu_n} + H \sin\sqrt{\mu_n} + \int_0^1 \bigl[ K_x(1,t) + H K(1,t) \bigr] \sin(\sqrt{\mu_n}\, t)\, dt = 0, \]

where now $\mu_n$ are the corresponding known eigenvalues. The representation

\[ K_x(1,t) + H K(1,t) = \sum_{k=1}^N b_k \sin(k\pi t) \]

leads to the system

\[ \sum_{k=1}^N b_k \int_0^1 \sin(k\pi t)\, \sin(\sqrt{\mu_n}\, t)\, dt = -\sqrt{\mu_n} \cos\sqrt{\mu_n} - H \sin\sqrt{\mu_n} \tag{5.65} \]


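for $n = 1,\ldots,N$. Equations (5.64) and (5.65) are of the same form and we
restrict ourselves to the discussion of (5.64). Asymptotically, the matrix
$A \in \mathbb{R}^{N \times N}$ defined by $A_{kn} = \int_0^1 \sin(k\pi t)\, \sin(\sqrt{\lambda_n}\, t)\, dt$ is just $\frac12 I$. More precisely,
from Parseval's identity (see (A.8) from Theorem A.15 of Appendix A.2)

\[ \sum_{k=1}^\infty \Bigl| \int_0^1 \psi(t) \sin(k\pi t)\, dt \Bigr|^2 = \frac12 \int_0^1 |\psi(t)|^2\, dt \]

we conclude that (set $\psi(t) = \sin(\sqrt{\lambda_n}\, t) - \sin(n\pi t)$ for some $n \in \mathbb{N}$)

\[
\begin{aligned}
\sum_{k=1}^\infty \Bigl| \int_0^1 \sin(k\pi t) \bigl[ \sin(\sqrt{\lambda_n}\, t) - \sin(n\pi t) \bigr]\, dt \Bigr|^2
&= \frac12 \int_0^1 \bigl| \sin(\sqrt{\lambda_n}\, t) - \sin(n\pi t) \bigr|^2\, dt \\
&\le \frac12 \bigl| \sqrt{\lambda_n} - n\pi \bigr|^2,
\end{aligned}
\]

where we used the mean value theorem. The estimate (5.30) yields $|\lambda_n - n^2\pi^2| \le \tilde c\, \|q\|_\infty$
and thus

\[ \bigl| \sqrt{\lambda_n} - n\pi \bigr| \le \frac{c}{n}\, \|q\|_\infty, \]

where $c$ is independent of $q$ and $n$. From this, we conclude that

\[ \sum_{k=1}^\infty \Bigl| \int_0^1 \sin(k\pi t) \bigl[ \sin(\sqrt{\lambda_n}\, t) - \sin(n\pi t) \bigr]\, dt \Bigr|^2 \le \frac{c^2}{2n^2}\, \|q\|_\infty^2. \]

The matrix $A$ is thus diagonally dominant, and therefore, invertible for sufficiently
small $\|q\|_\infty$. Numerical experiments have shown that also for "large"
values of $q$ the numerical solution of (5.65) does not cause any problems.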
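The linear system (5.64) is small and can be assembled in closed form, since $\int_0^1 \sin(at)\sin(bt)\,dt = \frac12\bigl[\sin(a-b)/(a-b) - \sin(a+b)/(a+b)\bigr]$ for $a \ne b$. The following sketch shows one possible way to do this; the function names are hypothetical, positive eigenvalues are assumed, and a plain Gaussian elimination is used instead of a library solver to keep the example self-contained.

```python
import math

def sine_overlap(a, b):
    """Closed form of int_0^1 sin(a t) sin(b t) dt for a, b > 0."""
    if abs(a - b) < 1e-12:
        return 0.5 - math.sin(2.0 * a) / (4.0 * a)
    return 0.5 * (math.sin(a - b) / (a - b) - math.sin(a + b) / (a + b))

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(rhs)
    M = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def cauchy_data_coefficients(lambdas):
    """Coefficients a_k of K(1,t) ~ sum_k a_k sin(k pi t) obtained from (5.64)."""
    N = len(lambdas)
    roots = [math.sqrt(lam) for lam in lambdas]
    # Equation n, unknown k: int_0^1 sin(k pi t) sin(sqrt(lambda_n) t) dt
    system = [[sine_overlap((k + 1) * math.pi, roots[n]) for k in range(N)]
              for n in range(N)]
    rhs = [-math.sin(root) for root in roots]
    return solve(system, rhs)

# For q = 0 the eigenvalues are n^2 pi^2 and K(1, .) vanishes.
print(cauchy_data_coefficients([(n * math.pi) ** 2 for n in range(1, 6)]))
```

The system (5.65) for the coefficients $b_k$ has the same matrix structure with $\mu_n$ in place of $\lambda_n$ and a different right-hand side.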


We are now facing the following inverse problem: Given (approximate values
of) the Cauchy data $f = K(1,\cdot) \in H^1_{00}(0,1)$ and $g = K_x(1,\cdot) \in L^2(0,1)$,
compute $q \in L^2(0,1)$ such that the solution of the Cauchy problem (5.41a)–(5.41c)
for $p = 0$ and $F = 0$ assumes the boundary data $K(x,x) = \frac12 \int_0^x q(s)\, ds$
for $x \in [0,1]$. An alternative way of formulating the inverse problem is to
start with the Goursat problem (5.53a)–(5.53c): Compute $q \in L^2(0,1)$ such
that the solution of the initial value problem (5.53a)–(5.53c) has Cauchy data
$f(t) = K(1,t)$ and $g(t) = K_x(1,t)$ for $t \in [0,1]$.
We have studied these coupled systems for $K$ and $q$ in Theorem 5.18. Here
we apply it for the case where $F = 0$. It has been shown that the pair $(K,r)$
solves the system

\[ \frac{\partial^2 K(x,t)}{\partial x^2} - \frac{\partial^2 K(x,t)}{\partial t^2} - q(x)\, K(x,t) = 0 \quad \text{in } \Delta_0, \]
\[ K(x,x) = \frac12 \int_0^x r(t)\, dt, \quad 0 \le x \le 1, \]
\[ K(x,0) = 0, \quad 0 \le x \le 1, \]

and

\[ K(1,t) = f(t) \quad\text{and}\quad \frac{\partial}{\partial x} K(1,t) = g(t) \quad \text{for all } t \in [0,1] \]

if and only if $w(\xi,\eta) = K(\xi+\eta,\xi-\eta)$ and $r$ solve the system of integral
equations (5.44a) and (5.44b). For this special choice of $F$, (5.44b) reduces to

\[ \frac12\, r(x) = -\int_x^1 q(y)\, K(y,2x-y)\, dy + g(2x-1) + f'(2x-1), \tag{5.66} \]

where we have extended $f$ and $g$ to odd functions on $[-1,1]$. Denote by $T(q)$ the
expression on the right-hand side of (5.66). For the evaluation of $T(q)$, one has
to solve the Cauchy problem (5.41a)–(5.41c) for $p = 0$. Note that the solution
$K$; that is, the kernel $K(y,2x-y)$ of the integral operator $T$, also depends on
$q$. The operator $T$ is therefore nonlinear!

The requirement $r = q$ leads to a fixed point equation $q = 2T(q)$ in $L^2(0,1)$. It was shown in [233] that there exists at most one fixed point $q \in L^\infty(0,1)$ of $T$. Even more, Rundell and Sachs proved that the projected operator $P_M T$ is a contraction on the ball $B_M := \{q \in L^\infty(0,1) : \|q\|_\infty \le M\}$ with respect to some weighted $L^\infty$-norms. Here, $P_M$ denotes the projection onto $B_M$, defined by

$$(P_M q)(x) = \begin{cases} q(x), & |q(x)| \le M, \\ M\,\operatorname{sign} q(x), & |q(x)| > M. \end{cases}$$

Also, they showed the effectiveness of the iteration method $q^{(k+1)} = 2T(q^{(k)})$ by several numerical examples. We observe that for $q^{(0)} = 0$ the first iterate $q^{(1)}$ is simply $q^{(1)}(x) = 2g(2x-1) + 2f'(2x-1)$, $x \in [0,1]$. We refer to [233] for more details.
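
The two explicit ingredients of this paragraph, the pointwise projection $P_M$ and the first iterate $q^{(1)}$, are easy to write down. The sketch below is only an illustration: the function names and the sample data $f(t) = t$, $g(t) = t$ are assumptions, and the full operator $T$ (which requires solving the Cauchy problem) is not implemented.

```python
import numpy as np

def project(q, M):
    """Pointwise projection P_M: returns q where |q| <= M and M*sign(q) elsewhere."""
    return np.clip(q, -M, M)

def first_iterate(g, f_prime, x):
    """q1(x) = 2 g(2x-1) + 2 f'(2x-1) with f and g extended as odd functions.

    g and f_prime are callables given on [0, 1]. The odd extension of g is
    sign(s) * g(|s|); the derivative of the odd extension of f is even."""
    s = 2.0 * np.asarray(x, dtype=float) - 1.0
    return 2.0 * np.sign(s) * g(np.abs(s)) + 2.0 * f_prime(np.abs(s))

# Hypothetical Cauchy data on [0, 1]: f(t) = t (so f' = 1) and g(t) = t.
x = np.array([0.25, 0.75])
q1 = first_iterate(lambda t: t, lambda t: np.ones_like(t), x)
print(q1, project(q1, 2.0))
```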

As suggested earlier, an alternative numerical procedure based on the kernel function $K$ is to define the operator $S$ from $L^2(0,1)$ into $H^1_{00}(0,1) \times L^2(0,1)$ by $S(q) = \bigl(K(1,\cdot),\, K_x(1,\cdot)\bigr)$, where $K$ solves the Goursat problem (5.53a)–(5.53c) in the weak sense. This operator is well-defined and bounded by Theorem 5.15, part (c). If $f \in H^1_{00}(0,1)$ and $g \in L^2(0,1)$ are the given Cauchy values $K(1,\cdot)$ and $K_x(1,\cdot)$, respectively, then we have to solve the nonlinear equation $S(q) = (f,g)$. Newton's method does it by the iteration procedure

$$q^{(k+1)} = q^{(k)} - S'\bigl(q^{(k)}\bigr)^{-1}\bigl[S\bigl(q^{(k)}\bigr) - (f,g)\bigr], \quad k = 0,1,\dots \tag{5.67}$$

For the implementation, one has to compute the Fréchet derivative of $S$. Using the Volterra equation (5.40) derived in the proof of Theorem 5.15, it is not difficult to prove that $S$ is Fréchet differentiable and that $S'(q)r = \bigl(W(1,\cdot),\, W_x(1,\cdot)\bigr)$, where $W$ solves the inhomogeneous Goursat problem

$$W_{xx}(x,t) - W_{tt}(x,t) - q(x)\,W(x,t) = K(x,t)\,r(x) \quad\text{in } \Delta_0, \tag{5.68a}$$

$$W(x,0) = 0, \quad 0 \le x \le 1, \tag{5.68b}$$

$$W(x,x) = \frac{1}{2}\int_0^x r(s)\,ds, \quad 0 \le x \le 1. \tag{5.68c}$$

In part (b) of Theorem 5.18, we showed that $S'(q)$ is an isomorphism. We reformulate this result.


Theorem 5.27 Let $q \in L^2(0,1)$ and $K$ be the weak solution of (5.53a)–(5.53c). For every $f \in H^1_{00}(0,1)$ and $g \in L^2(0,1)$, there exists a unique $r \in L^2(0,1)$ and a weak solution $W$ of (5.68a)–(5.68c) with $W(1,\cdot) = f$ and $W_x(1,\cdot) = g$ in the sense of (5.44a), (5.44b); that is, $S'(q)$ is an isomorphism.

Implementing Newton’s method is quite expensive because in every step one 
has to solve a coupled system of the form (5.68a)—(5.68c). Rundell and Sachs 
suggested a simplified Newton method of the form 


$$q^{(k+1)} = q^{(k)} - S'(0)^{-1}\bigl[S\bigl(q^{(k)}\bigr) - (f,g)\bigr], \quad k = 0,1,\dots$$
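
The structure of the simplified Newton iteration (derivative frozen at $q = 0$) can be sketched on a toy problem. In the following, `S` is a hypothetical stand-in chosen only so that $S(0) = 0$ and $S'(0)$ is the identity; it is not the operator $S$ of the text, and the names `simplified_newton` and `dS0_inv` are illustrative.

```python
import numpy as np

def simplified_newton(S, dS0_inv, data, q0, steps=50):
    """Iterate q <- q - S'(0)^{-1} [S(q) - data] with the derivative frozen at 0."""
    q = np.array(q0, dtype=float)
    for _ in range(steps):
        q = q - dS0_inv(S(q) - data)
    return q

# Toy stand-in (an assumption, NOT the operator of the text):
# S(q) = q + 0.1 q^2 componentwise, so S(0) = 0 and S'(0) is the identity.
S = lambda q: q + 0.1 * q**2
q_true = np.array([0.5, -0.3])
q_rec = simplified_newton(S, lambda r: r, S(q_true), q0=np.zeros(2))
print(q_rec)
```

Each step needs only one evaluation of `S` and one application of the fixed inverse, which is the reason for the reduced cost compared with the full Newton method.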


Because $S(0) = 0$, we can invert the linear operator $S'(0)$ analytically. In particular, we have $S'(0)r = \bigl(W(1,\cdot),\, W_x(1,\cdot)\bigr)$, where $W$ now solves

$$W_{xx}(x,t) - W_{tt}(x,t) = 0 \quad\text{in } \Delta_0,$$

$$W(x,0) = 0, \quad\text{and}\quad W(x,x) = \frac{1}{2}\int_0^x r(t)\,dt, \quad 0 \le x \le 1,$$

because also $K = 0$. The solution $W$ of the Cauchy problem

$$W_{xx}(x,t) - W_{tt}(x,t) = 0 \quad\text{in } \Delta_0,$$

$$W(1,t) = f(t), \quad\text{and}\quad W_x(1,t) = g(t), \quad 0 \le t \le 1,$$

is given by

$$W(x,t) = -\frac{1}{2}\int_{t-(1-x)}^{t+(1-x)} g(\tau)\,d\tau \;+\; \frac{1}{2}\,f\bigl(t+(1-x)\bigr) \;+\; \frac{1}{2}\,f\bigl(t-(1-x)\bigr),$$

where we have extended $f$ and $g$ to odd functions again. The solution $r$ of $S'(0)r = (f,g)$ is, therefore, given by

$$r(x) = 2\,\frac{d}{dx}\,W(x,x) = 2f'(2x-1) + 2g(2x-1).$$
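
The explicit formula for $r$ can be checked numerically. The following sketch uses hypothetical odd data $f(t) = \sin(\pi t)$ and $g(t) = t^3$ (chosen here only for illustration, with a known antiderivative of $g$) and compares a finite-difference derivative of $x \mapsto W(x,x)$ with $2f'(2x-1) + 2g(2x-1)$.

```python
import numpy as np

# Hypothetical odd data (assumptions for illustration): f(t) = sin(pi t), g(t) = t^3.
f = lambda t: np.sin(np.pi * t)
df = lambda t: np.pi * np.cos(np.pi * t)
g = lambda t: t**3
G = lambda t: t**4 / 4.0                 # antiderivative of g

def W(x, t):
    """Solution formula with Cauchy data W(1,t) = f(t) and W_x(1,t) = g(t)."""
    a, b = t - (1.0 - x), t + (1.0 - x)
    return -0.5 * (G(b) - G(a)) + 0.5 * f(b) + 0.5 * f(a)

def r_numeric(x, h=1e-5):
    """r(x) = 2 d/dx W(x,x), approximated by a central difference."""
    return 2.0 * (W(x + h, x + h) - W(x - h, x - h)) / (2.0 * h)

def r_formula(x):
    return 2.0 * df(2.0 * x - 1.0) + 2.0 * g(2.0 * x - 1.0)

x = np.linspace(0.1, 0.9, 5)
err = np.max(np.abs(r_numeric(x) - r_formula(x)))
print(err)
```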


In this chapter, we have studied only one particular inverse eigenvalue problem. 
Similar theoretical results and constructive algorithms can be obtained for other 
inverse spectral problems; see [4, 16]. For an excellent overview, we refer to the 
lecture notes by W. Rundell [232]. 


5.8 Problems 


5.1 Let $q, f \in C[0,1]$ and $q(x) \ge 0$ for all $x \in [0,1]$.

(a) Show that the following boundary value problem on $[0,1]$ has at most one solution $u \in C^2[0,1]$:

$$-u''(x) + q(x)\,u(x) = f(x), \quad u(0) = u(1) = 0. \tag{5.69}$$

(b) Let $v_1$ and $v_2$ be the solutions of the following initial value problems on $[0,1]$:

$$-v_1''(x) + q(x)\,v_1(x) = 0, \quad v_1(0) = 0, \quad v_1'(0) = 1,$$

$$-v_2''(x) + q(x)\,v_2(x) = 0, \quad v_2(1) = 0, \quad v_2'(1) = 1.$$

Show that the Wronskian $v_1' v_2 - v_2' v_1$ is constant. Define the following function for some $a \in \mathbb{R}$:

$$G(x,y) = \begin{cases} a\,v_1(x)\,v_2(y), & 0 \le x \le y \le 1, \\ a\,v_2(x)\,v_1(y), & 0 \le y < x \le 1. \end{cases}$$

Determine $a \in \mathbb{R}$ such that

$$u(x) = \int_0^1 G(x,y)\,f(y)\,dy, \quad x \in [0,1],$$

solves (5.69). The function $G$ is called Green's function of the boundary value problem (5.69).

(c) Show that the eigenvalue problem

$$-u''(x) + q(x)\,u(x) = \lambda\,u(x), \quad u(0) = u(1) = 0,$$

is equivalent to the eigenvalue problem for the integral equation

$$\frac{1}{\lambda}\,u(x) = \int_0^1 G(x,y)\,u(y)\,dy, \quad x \in [0,1].$$

Prove Theorem 5.7, parts (a) and (b) by the general spectral theorem (Theorem A.53 of Appendix A.6).

(d) How can one treat the case of part (c) when $q$ changes sign?

5.2 Let $H \in \mathbb{R}$. Prove that the transcendental equation $z \cot z + H = 0$ has a countable number of zeros $z_n$ and that

$$z_n = (n + 1/2)\pi + \frac{H}{(n+1/2)\pi} + O(1/n^2).$$

From this,

$$z_n^2 = (n+1/2)^2\pi^2 + 2H + O(1/n)$$

follows. Hint: Make the substitution $z = x + (n+1/2)\pi$, set $\varepsilon = 1/\bigl((n+1/2)\pi\bigr)$, write $z\cot z + H = 0$ in the form $f(x,\varepsilon) = 0$, and apply the implicit function theorem.

5.3 Prove Lemma 5.13.

5.4 Let $q \in C[0,1]$ be real- or complex-valued and $\lambda_n$, $g_n$ be the eigenvalues and $L^2$-normalized eigenfunctions, respectively, corresponding to $q$ and boundary conditions $u(0) = 0$ and $hu'(1) + Hu(1) = 0$. Show by modifying the proof of Theorem 5.21 that $\{g_n : n \in \mathbb{N}\}$ is complete in $L^2(0,1)$. This gives, even for real $q$, a proof different from the one obtained by applying the general spectral theory.

5.5 Consider the eigenvalue problem on $[0,1]$:

$$-u''(x) + q(x)\,u(x) = \lambda\,u(x), \quad u(0) = u(1) = 0.$$

By modifying the proof of Theorem 5.22, prove the following uniqueness result for the inverse problem: Let $(\lambda_n, u_n)$ and $(\mu_n, v_n)$ be the eigenvalues and eigenfunctions corresponding to $p$ and $q$, respectively. If $\lambda_n = \mu_n$ for all $n \in \mathbb{N}$ and

$$\frac{u_n'(1)}{u_n'(0)} = \frac{v_n'(1)}{v_n'(0)} \quad\text{for all } n \in \mathbb{N},$$

then $p$ and $q$ coincide.




Chapter 6 


An Inverse Problem in 
Electrical Impedance 
Tomography 


6.1 Introduction 


Electrical impedance tomography (EIT) is a medical imaging technique in which 
an image of the conductivity (or permittivity) of part of the body is deter- 
mined from electrical surface measurements. Typically, conducting electrodes 
are attached to the skin of the subject and small alternating currents are applied 
to some or all of the electrodes. The resulting electrical potentials are measured, 
and the process may be repeated for numerous different configurations of applied 
currents. 

Applications of EIT as an imaging tool can be found in fields such as 
medicine (monitoring of the lung function or the detection of skin cancer or 
breast cancer), geophysics (locating of underground deposits, detection of leaks 
in underground storage tanks), or nondestructive testing (determination of 
cracks in materials). 

To derive the EIT model, we start from the time-harmonic Maxwell system 
in the form 


$$\operatorname{curl} H + (i\omega\varepsilon - \gamma)\,E = 0, \qquad \operatorname{curl} E - i\omega\mu\,H = 0$$

in some domain which we take as a cylinder of the form $B \times \mathbb{R} \subset \mathbb{R}^3$ with bounded cross-section $B \subset \mathbb{R}^2$. Here, $\omega$, $\varepsilon$, $\gamma$, and $\mu$ denote the frequency, electric permittivity, conductivity, and magnetic permeability, respectively, which are all assumed to be constant along the axis of the cylinder; that is, depend on $x_1$ and $x_2$ only. We note that the real parts $\operatorname{Re}[\exp(-i\omega t)\,E(x)]$ and $\operatorname{Re}[\exp(-i\omega t)\,H(x)]$ are the physically meaningful electric and magnetic field, respectively. For low frequencies $\omega$ (i.e., for small $(\omega\mu\gamma)\cdot L^2$ where $L$ is a typical length scale of $B$), one can show (see, e.g., [47]) that the Maxwell system is approximated by

$$\operatorname{curl} H - \gamma E = 0, \qquad \operatorname{curl} E = 0.$$


The second equation yields the existence¹ of a scalar potential $u$ such that $E = -\nabla u$. Substituting this into the first equation and taking the divergence yields $\operatorname{div}(\gamma\nabla u) = 0$ in the cylinder. We restrict ourselves to the two-dimensional case and consider the conductivity equation

$$\operatorname{div}(\gamma\nabla u) = 0 \quad\text{in } B. \tag{6.1}$$

There are several possibilities for modeling the attachment of the electrodes on the boundary $\partial B$ of $B$. The simplest of these is the continuum model in which the potential $U = u|_{\partial B}$ and the boundary current distribution $f = \gamma\nabla u\cdot\nu = \gamma\,\partial u/\partial\nu$ are both given on the boundary $\partial B$. Here, $\nu = \nu(x)$ is the unit normal vector at $x \in \partial B$ directed into the exterior of $B$. First, we observe that,² by the divergence theorem,

$$0 = \int_B \operatorname{div}(\gamma\nabla u)\,dx = \int_{\partial B} \gamma\,\frac{\partial u}{\partial\nu}\,d\ell = \int_{\partial B} f\,d\ell;$$

that is, the boundary current distribution $f$ has zero mean. In practice, $f(x)$ is not known for all $x \in \partial B$. One actually knows the currents sent along wires attached to $N$ discrete electrodes that in turn are attached to the boundary $\partial B$. Therefore, in the gap model one approximates $f$ by assuming that $f$ is constant at the surface of each electrode and zero in the gaps between the electrodes. An even better choice is the complete model. Suppose that $f_j$ is the electric current sent through the wire attached to the $j$th electrode. At the surface $S_j$ of this electrode, the current density satisfies

$$\int_{S_j} \gamma\,\frac{\partial u}{\partial\nu}\,d\ell = f_j.$$

In the gaps between the electrodes, we have

$$\gamma\,\frac{\partial u}{\partial\nu} = 0 \quad\text{on } \partial B \setminus \bigcup_j S_j.$$

If electrochemical effects at the contact of $S_j$ with $\partial B$ are taken into account, the Dirichlet boundary condition $u = U_j$ on $S_j$ is replaced by

$$u + z_j\,\gamma\,\frac{\partial u}{\partial\nu} = U_j \quad\text{on } S_j,$$

where $z_j$ denotes the surface impedance of the $j$th electrode. We refer to [20, 145, 146, 252] for a discussion of these electrode models and the well-posedness of the corresponding boundary value problems (for given $\gamma$).


¹If the domain is simply connected.
²Provided $\gamma$, $f$, and $u$ are smooth enough.

In the inverse problem of EIT the conductivity function $\gamma$ is unknown and has to be determined from simultaneous measurements of the boundary voltages $U$ and current densities $f$, respectively.


In this introductory chapter on EIT, we restrict ourselves to the continuum model as the simplest electrode model. We start with the precise mathematical formulation of the direct and the inverse problem and prove well-posedness of the direct problem: existence, uniqueness, and continuous dependence on both the boundary data $f$ and the conductivity $\gamma$. Then we consider the inverse problem of EIT. The question of uniqueness is addressed, and we prove uniqueness of the inverse linearized problem. This problem is interesting also from an historical point of view because the proof, given in Calderón's fundamental paper [38], has influenced research on inverse medium problems monumentally. In the last section, we introduce a technique to determine the support of the contrast $\gamma - \gamma_1$ where $\gamma_1$ denotes the known background conductivity. This factorization method has been developed fairly recently (after publication of the first edition of this monograph) and is a prominent member of a whole class of newly developed methods subsumed under the name Sampling Methods.


6.2 The Direct Problem and the 
Neumann—Dirichlet Operator 


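Let $B \subset \mathbb{R}^2$ be a given bounded domain with boundary $\partial B$ and $\gamma : B \to \mathbb{R}$ and $f : \partial B \to \mathbb{R}$ be given real-valued functions. The direct problem is to determine $u$ such that

$$\operatorname{div}(\gamma\nabla u) = 0 \quad\text{in } B, \qquad \gamma\,\frac{\partial u}{\partial\nu} = f \quad\text{on } \partial B. \tag{6.2}$$

Throughout this chapter, $\nu = \nu(x)$ again denotes the exterior unit normal vector at $x \in \partial B$. As mentioned in the introduction, we have to assume that $\int_{\partial B} f\,d\ell = 0$. Therefore, throughout this chapter, we make the following assumptions on $B$, $\gamma$, and $f$:


Assumption 6.1 (a) $B \subset \mathbb{R}^2$ is a bounded Lipschitz domain³ such that the exterior of $B$ is connected.

(b) $\gamma \in L^\infty(B)$, and there exists $\gamma_0 > 0$ such that $\gamma(x) \ge \gamma_0$ for almost all $x \in B$.

(c) $f \in L^2_\diamond(\partial B)$ where

$$L^2_\diamond(\partial B) = \Bigl\{ f \in L^2(\partial B) : \int_{\partial B} f\,d\ell = 0 \Bigr\}.$$


³For a definition see, e.g., [191, 161].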

In this chapter and the following one on scattering theory, we have to use Sobolev spaces as the appropriate function spaces for the solution. For a general theory on Sobolev spaces, we refer to the standard literature such as [1, 191]. For obvious reasons, we also refer to the monograph [161], Sections 4.1 and 5.1. Also, in Appendix A.5, we introduce and study the Sobolev space $H^1(B)$ for the particular case of $B$ being the unit disc. At this place we recall only the very basic definition.

For any open and bounded set $B \subset \mathbb{R}^2$ the Sobolev space $H^1(B)$ is defined as the completion of $C^1(\overline{B})$ with respect to the norm

$$\|u\|_{H^1(B)} = \sqrt{\int_B \bigl[\,|\nabla u|^2 + |u|^2\,\bigr]\,dx}\,.$$

An important property of Sobolev spaces for Lipschitz domains is the existence of traces; that is, for every $u \in H^1(B)$ the trace $u|_{\partial B}$ on $\partial B$ is well-defined and represents an $L^2(\partial B)$-function⁴ (Theorem 5.10 in [161]). Also, there exists $c_T > 0$ (independent of $u$) such that $\|u|_{\partial B}\|_{L^2(\partial B)} \le c_T\,\|u\|_{H^1(B)}$ for all $u \in H^1(B)$; that is, the trace operator $u \mapsto u|_{\partial B}$ is bounded.

We note that the solution $u$ of (6.2) is only unique up to an additive constant. Therefore, we normalize the solution $u \in H^1(B)$ such that it has vanishing mean on the boundary; that is, $u \in H^1_\diamond(B)$ where

$$H^1_\diamond(B) = \Bigl\{ u \in H^1(B) : \int_{\partial B} u\,d\ell = 0 \Bigr\}. \tag{6.3}$$

The formulation (6.2) of the boundary value problem has to be understood in the variational (or weak) sense. By multiplying the first equation of (6.2) with some test function $\psi$ and using Green's first formula we arrive at

$$\begin{aligned}
0 = \int_B \psi\,\operatorname{div}(\gamma\nabla u)\,dx
&= -\int_B \gamma\,\nabla\psi\cdot\nabla u\,dx + \int_{\partial B} \psi\,\gamma\,\nabla u\cdot\nu\,d\ell \\
&= -\int_B \gamma\,\nabla\psi\cdot\nabla u\,dx + \int_{\partial B} \psi\,f\,d\ell .
\end{aligned}$$

Therefore, we define the variational solution $u \in H^1_\diamond(B)$ of (6.2) by the solution of

$$\int_B \gamma\,\nabla\psi\cdot\nabla u\,dx = \int_{\partial B} \psi\,f\,d\ell \quad\text{for all } \psi \in H^1_\diamond(B). \tag{6.4}$$

Existence and uniqueness follow from the representation theorem due to Riesz (cf. Theorem A.23 of Appendix A.3).


Theorem 6.2 Let Assumption 6.1 be satisfied. For every $f \in L^2_\diamond(\partial B)$ there exists a unique variational solution $u \in H^1_\diamond(B)$ of (6.2), that is, a solution of the variational equation (6.4). Furthermore, there exists a constant $c > 0$ (independent of $f$) such that $\|u\|_{H^1(B)} \le c\,\|f\|_{L^2(\partial B)}$. In other words: the operator $f \mapsto u$ from $L^2_\diamond(\partial B)$ to $H^1_\diamond(B)$ is bounded.

⁴The trace is even more regular and belongs to the fractional Sobolev space $H^{1/2}(\partial B)$.


Proof: We define a new inner product in the space $H^1_\diamond(B)$ by

$$(u,v)_* = \int_B \gamma\,\nabla u\cdot\nabla v\,dx, \quad u,v \in H^1_\diamond(B).$$

The corresponding norm $\|u\|_* = \sqrt{(u,u)_*}$ is equivalent to the ordinary norm $\|\cdot\|_{H^1(B)}$ in $H^1_\diamond(B)$. This follows from Friedrich's inequality in the form (see Theorem A.50 for the case of $B$ being the unit disc and [1, 191] for more general Lipschitz domains): There exists $c_F > 0$ such that

$$\|v\|_{L^2(B)} \le c_F\,\|\nabla v\|_{L^2(B)} \quad\text{for all } v \in H^1_\diamond(B). \tag{6.5}$$

Indeed, from this the equivalence follows inasmuch as

$$\frac{\gamma_0}{1 + c_F^2}\,\|v\|_{H^1(B)}^2 \;\le\; \gamma_0\,\|\nabla v\|_{L^2(B)}^2 \;\le\; \|v\|_*^2 \;\le\; \|\gamma\|_\infty\,\|v\|_{H^1(B)}^2 \tag{6.6}$$

for all $v \in H^1_\diamond(B)$. For fixed $f \in L^2_\diamond(\partial B)$ we can interpret the right-hand side of (6.4) as a linear functional $F$ on the space $H^1_\diamond(B)$; that is, $F(\psi) = (f,\psi)_{L^2(\partial B)}$ for $\psi \in H^1_\diamond(B)$. This functional $F$ is bounded by the inequality of Cauchy-Schwarz and the trace theorem (with constant $c_T > 0$) because

$$\begin{aligned}
|F(\psi)| \;\le\; \|f\|_{L^2(\partial B)}\,\|\psi\|_{L^2(\partial B)}
&\le c_T\,\|f\|_{L^2(\partial B)}\,\|\psi\|_{H^1(B)} \\
&\le c\,\|f\|_{L^2(\partial B)}\,\|\psi\|_*
\end{aligned}$$

with $c = c_T\sqrt{(1 + c_F^2)/\gamma_0}$. In particular, $\|F\|_{H^1_\diamond(B)^*} \le c\,\|f\|_{L^2(\partial B)}$, and we can apply the representation theorem of Riesz in the Hilbert space $\bigl(H^1_\diamond(B), (\cdot,\cdot)_*\bigr)$: there exists a unique $u \in H^1_\diamond(B)$ with $(u,\psi)_* = F(\psi)$ for all $\psi \in H^1_\diamond(B)$. This is exactly the variational equation (6.4). Furthermore, $\|u\|_* = \|F\|_{H^1_\diamond(B)^*}$ and thus by (6.6),

$$\|u\|_{H^1(B)}^2 \;\le\; \frac{1 + c_F^2}{\gamma_0}\,\|u\|_*^2 \;=\; \frac{1 + c_F^2}{\gamma_0}\,\|F\|_{H^1_\diamond(B)^*}^2 \;\le\; c^2\,\frac{1 + c_F^2}{\gamma_0}\,\|f\|_{L^2(\partial B)}^2 ;$$

that is, the operator $f \mapsto u$ is bounded from $L^2_\diamond(\partial B)$ into $H^1_\diamond(B)$.


This theorem implies the existence and boundedness of the Neumann–Dirichlet operator.


Definition 6.3 The Neumann–Dirichlet operator $\Lambda : L^2_\diamond(\partial B) \to L^2_\diamond(\partial B)$ is defined by $\Lambda f = u|_{\partial B}$, where $u \in H^1_\diamond(B)$ is the uniquely determined variational solution of (6.2); that is, the solution of (6.4).

Remark: This operator is bounded by the boundedness of the solution map $f \mapsto u$ from $L^2_\diamond(\partial B)$ to $H^1_\diamond(B)$ and the boundedness of the trace operator from $H^1_\diamond(B)$ to $L^2_\diamond(\partial B)$. It is even compact because the trace operator is compact from $H^1_\diamond(B)$ to $L^2_\diamond(\partial B)$. However, we do not make use of this latter property.


We show some properties of the Neumann–Dirichlet operator.


Theorem 6.4 Let Assumption 6.1 be satisfied. Then the Neumann–Dirichlet map $\Lambda$ is self-adjoint and positive; that is, $(\Lambda f, g)_{L^2(\partial B)} = (f, \Lambda g)_{L^2(\partial B)}$ and $(\Lambda f, f)_{L^2(\partial B)} > 0$ for all $f, g \in L^2_\diamond(\partial B)$, $f \ne 0$.

Proof: This follows simply from the definition of $\Lambda$ and Green's first identity. Let $u, v \in H^1_\diamond(B)$ be the solutions of (6.4) corresponding to boundary data $f$ and $g$, respectively. Then, by (6.4) for the pair $g, v$ and the choice $\psi = u$ (note that $u|_{\partial B} = \Lambda f$),

$$(\Lambda f, g)_{L^2(\partial B)} = \int_{\partial B} u\,g\,d\ell = \int_B \gamma\,\nabla v\cdot\nabla u\,dx,$$

and this term is symmetric with respect to $u$ and $v$. For $f = g$ this also yields that $\Lambda$ is positive.


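In the following, we write $\Lambda_\gamma$ to indicate the dependence on $\gamma$. The next interesting property is a monotonicity result.


Theorem 6.5 Let $\gamma_1, \gamma_2 \in L^\infty(B)$ with $\gamma_1 \ge \gamma_2 \ge \gamma_0$ a.e. on $B$. Then $\Lambda_{\gamma_1} \le \Lambda_{\gamma_2}$ in the sense that

$$(\Lambda_{\gamma_1} f, f)_{L^2(\partial B)} \;\le\; (\Lambda_{\gamma_2} f, f)_{L^2(\partial B)} \quad\text{for all } f \in L^2_\diamond(\partial B).$$


Proof: For fixed $f \in L^2_\diamond(\partial B)$ let $u_j \in H^1_\diamond(B)$ be the corresponding solution of (6.2) for $\gamma_j$, $j = 1,2$. From (6.4) (for $\gamma = \gamma_2$, $u = u_2$, and $\psi = u_1 - u_2$) we conclude that

$$\begin{aligned}
\bigl((\Lambda_{\gamma_1} - \Lambda_{\gamma_2}) f, f\bigr)_{L^2(\partial B)}
&= \int_{\partial B} (u_1 - u_2)\,f\,d\ell = \int_B \gamma_2\,(\nabla u_1 - \nabla u_2)\cdot\nabla u_2\,dx \\
&= \frac{1}{2}\int_B \gamma_2\,\bigl[\,|\nabla u_1|^2 - |\nabla u_2|^2 - |\nabla(u_1 - u_2)|^2\,\bigr]\,dx \\
&\le \frac{1}{2}\int_B \gamma_2\,|\nabla u_1|^2\,dx - \frac{1}{2}\int_B \gamma_2\,|\nabla u_2|^2\,dx \\
&\le \frac{1}{2}\int_B \gamma_1\,|\nabla u_1|^2\,dx - \frac{1}{2}\int_B \gamma_2\,|\nabla u_2|^2\,dx \\
&= \frac{1}{2}\,\bigl((\Lambda_{\gamma_1} - \Lambda_{\gamma_2}) f, f\bigr)_{L^2(\partial B)},
\end{aligned}$$

which proves the result.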

6.3 The Inverse Problem


As described in the introduction, the problem of electrical impedance tomography is to determine (properties of) the conductivity distribution $\gamma$ from all, or at least a large number of, pairs $(f, u|_{\partial B})$. Because $u|_{\partial B} = \Lambda f$ we can rephrase this problem as follows:


Inverse Problem: Determine the conductivity $\gamma$ from the given Neumann–Dirichlet operator $\Lambda : L^2_\diamond(\partial B) \to L^2_\diamond(\partial B)$!


As we have seen already in the previous chapter (and this is typical for studying inverse problems), an intensive investigation of the direct problem has to precede the treatment of the inverse problem. In particular, we study the dependence of the Neumann–Dirichlet map on $\gamma$. First, we show with an example that the inverse problem of impedance tomography is ill-posed.


Example 6.6 Let $B = B(0,1)$ be the unit disk, $\hat q > 0$ constant, and $R \in (0,1)$. We define $\gamma_R \in L^\infty(B)$ by

$$\gamma_R(x) = \begin{cases} 1, & R < |x| < 1, \\ 1 + \hat q, & |x| < R. \end{cases}$$

Because $\gamma_R$ is piecewise constant the solution $u \in H^1_\diamond(B)$ is a harmonic function in $B(0,1) \setminus \{x : |x| = R\}$ (that is, $\Delta u = 0$ in $B(0,1) \setminus \{x : |x| = R\}$) and satisfies the jump conditions $u|_- = u|_+$ and $(1+\hat q)\,\partial u/\partial r|_- = \partial u/\partial r|_+$ for $|x| = R$ where $v|_\pm$ denotes the trace of $v$ from the interior ($-$) and exterior ($+$) of $\{x : |x| = R\}$, respectively. We refer to Problem 6.1 for a justification of this statement.

We solve the boundary value problem (6.2) by expanding the boundary data $f \in L^2_\diamond(\partial B)$ and the solution $u$ into Fourier series; that is, for

$$f(\varphi) = \sum_{n \ne 0} f_n\,e^{in\varphi}, \quad \varphi \in [0, 2\pi],$$

we make an ansatz for the solution of (6.2) in the form

$$u(r,\varphi) = \begin{cases}
\displaystyle\sum_{n \ne 0} (b_n + c_n) \Bigl(\frac{r}{R}\Bigr)^{|n|} e^{in\varphi}, & r < R, \\[2ex]
\displaystyle\sum_{n \ne 0} \Bigl[\, b_n \Bigl(\frac{r}{R}\Bigr)^{|n|} + c_n \Bigl(\frac{R}{r}\Bigr)^{|n|} \,\Bigr] e^{in\varphi}, & r > R.
\end{cases}$$

The ansatz already guarantees that $u$ is continuous on the circle $r = R$. The unknown coefficients $b_n$, $c_n$ are determined from the conditions $(1+\hat q)\,\partial u/\partial r|_- = \partial u/\partial r|_+$ for $r = R$ and $\partial u/\partial r = f$ for $r = 1$. This yields the set of equations

$$(1 + \hat q)\,(b_n + c_n) = b_n - c_n$$

and

$$b_n\,\frac{|n|}{R^{|n|}} - c_n\,|n|\,R^{|n|} = f_n$$

for all $n \ne 0$ which yields explicit formulas for $b_n$ and $c_n$. Substituting this into the form of $u$ and taking $r = 1$ yields

$$(\Lambda_{\gamma_R} f)(\varphi) = u(1,\varphi) = \sum_{n \ne 0} \frac{\alpha - R^{2|n|}}{\alpha + R^{2|n|}}\,\frac{f_n}{|n|}\,e^{in\varphi}, \quad \varphi \in [0, 2\pi], \tag{6.7}$$

with $\alpha = 1 + 2/\hat q$. We observe that $\Lambda_{\gamma_R}$ is a diagonal operator from $L^2_\diamond(0, 2\pi)$ into itself with eigenvalues that behave asymptotically as $1/|n|$. Therefore, the natural setting for $\Lambda_{\gamma_R}$ is to consider it as an operator from the Sobolev space $H^{-1/2}_\diamond(0, 2\pi)$ of order $-1/2$ into the Sobolev space $H^{1/2}_\diamond(0, 2\pi)$ of order $1/2$; see Section A.4 of Appendix A. We prefer the setting in $L^2_\diamond(0, 2\pi)$ because the more general setting does not give any more insight with respect to the inverse problem.
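
For completeness, the elimination that leads to (6.7) can be written out. This is a worked sketch of the algebra only, using the two coefficient equations of the example and the abbreviation $\alpha = 1 + 2/\hat q$:

```latex
\begin{aligned}
(1+\hat q)(b_n + c_n) = b_n - c_n
  &\;\Longrightarrow\; \hat q\, b_n = -(2+\hat q)\, c_n
  \;\Longrightarrow\; b_n = -\alpha\, c_n , \\
b_n \frac{|n|}{R^{|n|}} - c_n |n| R^{|n|} = f_n
  &\;\Longrightarrow\; c_n = -\frac{R^{|n|}}{\alpha + R^{2|n|}}\,\frac{f_n}{|n|} , \qquad
  b_n = \frac{\alpha R^{|n|}}{\alpha + R^{2|n|}}\,\frac{f_n}{|n|} , \\
u(1,\varphi) = \sum_{n\neq 0}\Bigl[\frac{b_n}{R^{|n|}} + c_n R^{|n|}\Bigr] e^{in\varphi}
  &\;=\; \sum_{n\neq 0} \frac{\alpha - R^{2|n|}}{\alpha + R^{2|n|}}\,\frac{f_n}{|n|}\, e^{in\varphi} .
\end{aligned}
```
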
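
Let $\Lambda_1$ be the operator with $\gamma = 1$ which is given by

$$(\Lambda_1 f)(\varphi) = \sum_{n \ne 0} \frac{f_n}{|n|}\,e^{in\varphi}, \quad \varphi \in [0, 2\pi].$$

We estimate the difference by

$$\begin{aligned}
\|(\Lambda_{\gamma_R} - \Lambda_1) f\|_{L^2(0,2\pi)}^2
&= 2\pi \sum_{n \ne 0} \left| \frac{\alpha - R^{2|n|}}{\alpha + R^{2|n|}} - 1 \right|^2 \frac{|f_n|^2}{n^2} \\
&= 8\pi \sum_{n \ne 0} \frac{R^{4|n|}}{\bigl(\alpha + R^{2|n|}\bigr)^2}\,\frac{|f_n|^2}{n^2} \\
&\le \frac{4R^4}{\alpha^2}\,\|f\|_{L^2(0,2\pi)}^2 ;
\end{aligned}$$

that is,

$$\|\Lambda_{\gamma_R} - \Lambda_1\|_{\mathcal{L}(L^2_\diamond(0,2\pi))} \;\le\; \frac{2R^2}{\alpha} \;\le\; 2R^2$$

because $\alpha > 1$. Therefore, we have convergence of $\Lambda_{\gamma_R}$ to $\Lambda_1$ in the operator norm as $R$ tends to zero. On the other hand, the difference $\|\gamma_R - 1\|_\infty = \hat q$ is constant and does not converge to zero as $R$ tends to zero. This shows clearly that the inverse problem to determine $\gamma_R$ from $\Lambda$ is ill-posed.

One can argue that perhaps the sup-norm for $\gamma$ is not appropriate to measure the error in $\gamma$. Our example, however, shows that even if we replace $\hat q$ by a constant $\hat q_R$ which depends on $R$ such that $\lim_{R \to 0} \hat q_R = \infty$ we still have convergence of $\Lambda_{\gamma_R}$ to $\Lambda_1$ in the operator norm as $R$ tends to zero. Taking, for example, $\hat q_R = \hat q/R^3$, we observe that $\|\gamma_R - 1\|_{L^p(B)} \to \infty$ as $R$ tends to zero for arbitrary $p \ge 1$, and the problem of impedance tomography is also ill-posed with respect to any $L^p$-norm.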
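
The ill-posedness in this example can be made concrete with a few numbers. The sketch below evaluates the eigenvalues from (6.7) and the resulting operator-norm gap for shrinking $R$; the function names and the truncation `n_max` are assumptions made for illustration (the largest gap occurs at $|n| = 1$, so the truncation does not affect the norm).

```python
import numpy as np

def nd_eigenvalues(R, q_hat, n_max=50):
    """Eigenvalues (alpha - R^{2n}) / ((alpha + R^{2n}) n), n = 1..n_max, from (6.7)."""
    n = np.arange(1, n_max + 1)
    alpha = 1.0 + 2.0 / q_hat
    return (alpha - R ** (2 * n)) / ((alpha + R ** (2 * n)) * n)

def operator_gap(R, q_hat, n_max=50):
    """Operator norm of the difference to the case gamma = 1 (eigenvalues 1/n).

    Both operators are diagonal in the same Fourier basis, so the norm of the
    difference is the largest absolute difference of the eigenvalues."""
    n = np.arange(1, n_max + 1)
    return np.max(np.abs(nd_eigenvalues(R, q_hat, n_max) - 1.0 / n))

for R in [0.5, 0.1, 0.01]:
    gap_fixed = operator_gap(R, q_hat=1.0)           # contrast kept fixed
    gap_blowup = operator_gap(R, q_hat=1.0 / R**3)   # contrast growing like 1/R^3
    l1_contrast = (1.0 / R**3) * np.pi * R**2        # L^1 norm of gamma_R - 1
    print(R, gap_fixed, gap_blowup, l1_contrast)
```

The printed gaps shrink like $R^2$ in both cases, while the $L^1$ size of the contrast in the second case grows like $1/R$.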


A fundamental question for every inverse problem is the question of uniqueness: is the information (at least in principle) sufficient to determine the unknown quantity? Therefore, in electrical impedance tomography, we ask: does the knowledge of the Neumann–Dirichlet operator $\Lambda$ determine the conductivity $\gamma$ uniquely or is it possible that two different $\gamma$ correspond to the same $\Lambda$?

In full generality, this fundamental question was not answered until 2006 by K. Astala and L. Päivärinta in [10]. We state the result without proof.


Theorem 6.7 Let $\gamma_1, \gamma_2 \in L^\infty(B)$ with $\gamma_j(x) \ge \gamma_0$ for $j = 1,2$ and almost all $x \in B$. We denote the corresponding Neumann–Dirichlet operators by $\Lambda_1$ and $\Lambda_2$, respectively. If $\Lambda_1 = \Lambda_2$ then $\gamma_1 = \gamma_2$ in $B$.


Instead of proving this theorem which uses refined arguments from complex analysis, we consider the linearized problem. Therefore, writing $\Lambda(\gamma)$ instead of $\Lambda_\gamma$ to indicate the dependence on $\gamma$, we consider the linear problem

$$\Lambda(\gamma) + \Lambda'(\gamma)\,q = \Lambda_{\text{meas}}, \tag{6.8}$$

where $\Lambda'(\gamma) : L^\infty(B) \to \mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ denotes the Fréchet derivative of the nonlinear operator $\gamma \mapsto \Lambda(\gamma)$ from $L^\infty(B)$ to $\mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ at $\gamma$. Here, $\mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ denotes again the space of all linear and bounded operators from $L^2_\diamond(\partial B)$ into itself equipped with the operator norm. The right-hand side $\Lambda_{\text{meas}} \in \mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ is given ("measured"), and the contrast $q \in L^\infty(B)$ has to be determined.


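Theorem 6.8 Let $U \subset L^\infty(B)$ be given by $U = \{\gamma \in L^\infty(B) : \gamma \ge \gamma_0 \text{ a.e. on } B\}$.

(a) The mapping $\gamma \mapsto \Lambda(\gamma)$ from $U$ to $\mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ is Lipschitz continuous.

(b) The mapping $\gamma \mapsto \Lambda(\gamma)$ from $U$ to $\mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ is Fréchet differentiable. The Fréchet derivative $\Lambda'(\gamma)$ at $\gamma \in U$ in the direction $q \in L^\infty(B)$ is given by $[\Lambda'(\gamma)q]\,f = v|_{\partial B}$ where $v \in H^1_\diamond(B)$ solves

$$\operatorname{div}(\gamma\nabla v) = -\operatorname{div}(q\nabla u) \quad\text{in } B, \qquad \gamma\,\frac{\partial v}{\partial\nu} = -q\,\frac{\partial u}{\partial\nu} \quad\text{on } \partial B, \tag{6.9}$$

and $u \in H^1_\diamond(B)$ solves (6.2) with data $\gamma \in U$ and $f \in L^2_\diamond(\partial B)$. The solution of (6.9) is again understood in the weak sense; that is,

$$\int_B \gamma\,\nabla\psi\cdot\nabla v\,dx = -\int_B q\,\nabla\psi\cdot\nabla u\,dx \quad\text{for all } \psi \in H^1_\diamond(B). \tag{6.10}$$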


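Proof: (a) Let $\gamma_1, \gamma_2 \in U$, $f \in L^2_\diamond(\partial B)$, and $u_1, u_2 \in H^1_\diamond(B)$ be the corresponding weak solutions of (6.2). Taking the difference of (6.4) for the triples $(\gamma_1, u_1, f)$ and $(\gamma_2, u_2, f)$ yields

$$\int_B \gamma_1\,\nabla(u_1 - u_2)\cdot\nabla\psi\,dx = \int_B (\gamma_2 - \gamma_1)\,\nabla u_2\cdot\nabla\psi\,dx \quad\text{for all } \psi \in H^1_\diamond(B).$$

With $\psi = u_1 - u_2$ and the lower bound $\gamma_0 \le \gamma_1$ this yields

$$\begin{aligned}
\gamma_0\,\|\nabla(u_1 - u_2)\|_{L^2(B)}^2
&\le \int_B \gamma_1\,|\nabla(u_1 - u_2)|^2\,dx \\
&= \int_B (\gamma_2 - \gamma_1)\,\nabla u_2\cdot\nabla(u_1 - u_2)\,dx \\
&\le \|\gamma_1 - \gamma_2\|_\infty\,\|\nabla(u_1 - u_2)\|_{L^2(B)}\,\|\nabla u_2\|_{L^2(B)} ;
\end{aligned}$$

that is, there exists a constant $c_1 > 0$ (independent of $\gamma_1, \gamma_2$) with

$$\begin{aligned}
\|\nabla(u_1 - u_2)\|_{L^2(B)}
&\le \frac{1}{\gamma_0}\,\|\gamma_1 - \gamma_2\|_\infty\,\|\nabla u_2\|_{L^2(B)} \\
&\le c_1\,\|\gamma_1 - \gamma_2\|_\infty\,\|f\|_{L^2(\partial B)},
\end{aligned} \tag{6.11}$$

where we use the boundedness of the mapping $f \mapsto u_2$ (see Theorem 6.2). Now we use the trace theorem and (6.6) to conclude that

$$\|\Lambda(\gamma_1) f - \Lambda(\gamma_2) f\|_{L^2(\partial B)} = \|(u_1 - u_2)|_{\partial B}\|_{L^2(\partial B)} \le c_2\,\|\gamma_1 - \gamma_2\|_\infty\,\|f\|_{L^2(\partial B)} ;$$

that is,

$$\|\Lambda(\gamma_1) - \Lambda(\gamma_2)\|_{\mathcal{L}(L^2_\diamond(\partial B))} \le c_2\,\|\gamma_1 - \gamma_2\|_\infty ,$$

which proves part (a).


(b) Let $\gamma \in U$ and $q \in L^\infty(B)$ such that $\|q\|_\infty \le \gamma_0/2$. Then $\gamma + q \ge \gamma_0/2$ a.e. on $B$. Let $u, u_q \in H^1_\diamond(B)$ correspond to $\gamma$ and $\gamma + q$, respectively, and boundary data $f$. Subtraction of (6.4) for the triple $(\gamma, u, f)$ and (6.10) from (6.4) for $(\gamma + q, u_q, f)$ yields

$$\int_B \gamma\,\nabla(u_q - u - v)\cdot\nabla\psi\,dx = \int_B q\,\nabla(u - u_q)\cdot\nabla\psi\,dx \quad\text{for all } \psi \in H^1_\diamond(B).$$

Taking $\psi = u_q - u - v$ yields as in part (a) an estimate of the form

$$\|\nabla(u_q - u - v)\|_{L^2(B)} \le \frac{1}{\gamma_0}\,\|q\|_\infty\,\|\nabla(u - u_q)\|_{L^2(B)} .$$

Now we use (6.11) (with $u_1 = u$ and $u_2 = u_q$) to conclude that

$$\|\nabla(u_q - u - v)\|_{L^2(B)} \le \frac{c_1}{\gamma_0}\,\|q\|_\infty^2\,\|f\|_{L^2(\partial B)} .$$

Again by the trace theorem and (6.6) this yields

$$\begin{aligned}
\|\Lambda(\gamma + q) f - \Lambda(\gamma) f - [\Lambda'(\gamma)q]\,f\|_{L^2(\partial B)}
&= \|(u_q - u - v)|_{\partial B}\|_{L^2(\partial B)} \\
&\le c\,\|q\|_\infty^2\,\|f\|_{L^2(\partial B)},
\end{aligned}$$

which proves part (b).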
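
As a quick sanity check of part (b), which is not part of the proof, one may take $\gamma > 0$ and $q$ both constant with $\gamma + q > 0$. Then (6.4) shows that $u = u_1/\gamma$, where $u_1$ is the solution for conductivity 1, so that $\Lambda(\gamma) = \Lambda(1)/\gamma$, and $v = -(q/\gamma)\,u$ solves (6.10). Consequently:

```latex
\begin{aligned}
[\Lambda'(\gamma)q]\,f &= v|_{\partial B} = -\frac{q}{\gamma}\,u|_{\partial B} = -\frac{q}{\gamma^2}\,\Lambda(1)f , \\
\Lambda(\gamma+q) - \Lambda(\gamma) - \Lambda'(\gamma)q
  &= \Bigl[\frac{1}{\gamma+q} - \frac{1}{\gamma} + \frac{q}{\gamma^2}\Bigr]\Lambda(1)
   = \frac{q^2}{\gamma^2(\gamma+q)}\,\Lambda(1) ,
\end{aligned}
```

which is of order $q^2$, in agreement with the quadratic remainder estimate of the proof.
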

We now show that, for any given constant background medium $\gamma$, the linearized inverse problem of electrical impedance tomography (6.8) has at most one solution; that is, the Fréchet derivative is one-to-one. As already mentioned in the introduction, this proof is due to Calderón (see [38]) and has "opened the door" to many uniqueness results in tomography and scattering theory. We come back to this method in the next chapter where we prove uniqueness of an inverse scattering problem by this method.


Theorem 6.9 Let $\gamma$ be constant. Then the Fréchet derivative $\Lambda'(\gamma) : L^\infty(B) \to \mathcal{L}\bigl(L^2_\diamond(\partial B)\bigr)$ is one-to-one.


Proof: First we note that we can assume without loss of generality that $\gamma = 1$. Let $q \in L^\infty(B)$ such that $\Lambda'(\gamma)q = 0$; that is, $[\Lambda'(\gamma)q]\,f = 0$ for all $f \in L^2_\diamond(\partial B)$.

The proof consists of two parts. First, we show that $q$ is orthogonal to all products of two gradients of harmonic functions. Then, in the second part, by choosing special harmonic functions we show that the Fourier transform of $q$ vanishes.

Let $u_1 \in C^2(\overline{B})$ be any harmonic function; that is, $\Delta u_1 = 0$ in $B$. Define $f \in L^2_\diamond(\partial B)$ by $f = \partial u_1/\partial\nu$ on $\partial B$. Then $u_1$ is the solution of (6.2) with Neumann boundary data $f$. We denote by $v_1 \in H^1_\diamond(B)$ the corresponding solution of (6.10); that is,

$$\int_B \nabla\psi\cdot\nabla v_1\,dx = -\int_B q\,\nabla\psi\cdot\nabla u_1\,dx \quad\text{for all } \psi \in H^1_\diamond(B).$$

Now we take a second arbitrary harmonic function $u_2 \in C^2(\overline{B})$ and set $\psi = u_2$ in the previous equation. This yields

$$\int_B q\,\nabla u_2\cdot\nabla u_1\,dx = -\int_B \nabla u_2\cdot\nabla v_1\,dx = -\int_{\partial B} v_1\,\frac{\partial u_2}{\partial\nu}\,d\ell$$

by Green's first theorem. Now we note that $v_1|_{\partial B} = [\Lambda'(\gamma)q]\,f = 0$. Therefore, we conclude that the right-hand side vanishes; that is

$$\int_B q\,\nabla u_2\cdot\nabla u_1\,dx = 0 \quad\text{for all harmonic functions } u_1, u_2 \in C^2(\overline{B}). \tag{6.12}$$

So far, we considered real-valued functions $u_1$ and $u_2$. By taking the real and imaginary parts, we can also allow complex-valued harmonic functions for $u_1$ and $u_2$.

Now we fix any $y \in \mathbb{R}^2$ with $y \ne 0$. Let $y^\perp \in \mathbb{R}^2$ be a vector (unique up to sign) with $y\cdot y^\perp = 0$ and $|y| = |y^\perp|$. Define the complex vectors $z^\pm \in \mathbb{C}^2$ by $z^\pm = \frac{1}{2}\,(iy \pm y^\perp)$. Then one computes that $z^\pm\cdot z^\pm = \sum_{j=1}^2 (z_j^\pm)^2 = 0$ and $z^+\cdot z^- = \sum_{j=1}^2 z_j^+ z_j^- = -\frac{1}{2}|y|^2$ and $z^+ + z^- = iy$. From this, we observe that the functions $u^\pm(x) = \exp(z^\pm\cdot x)$, $x \in \mathbb{R}^2$, are harmonic in all of $\mathbb{R}^2$. Therefore, substituting $u^+$ and $u^-$ into (6.12) yields

$$0 = \int_B q\,\nabla u^+\cdot\nabla u^-\,dx = z^+\cdot z^- \int_B q(x)\,e^{(z^+ + z^-)\cdot x}\,dx = -\frac{1}{2}\,|y|^2 \int_B q(x)\,e^{i\,y\cdot x}\,dx .$$

From this, we conclude that the Fourier transform of $q$ (extended by zero in the exterior of $B$) vanishes on $\mathbb{R}^2 \setminus \{0\}$, and thus also $q$ itself. This ends the proof.
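
The algebraic identities for the complex vectors $z^\pm$ are easy to confirm numerically. The sketch below is an illustration with an arbitrarily chosen $y = (3,4)$; the helper names are assumptions. Note that the product in the proof is bilinear, without complex conjugation, and that $\Delta \exp(z\cdot x) = (z\cdot z)\exp(z\cdot x)$, so $z\cdot z = 0$ is exactly the harmonicity condition.

```python
import numpy as np

def calderon_vectors(y):
    """Return z+ and z- = (i*y +/- y_perp)/2 for a nonzero y in R^2."""
    y = np.asarray(y, dtype=float)
    y_perp = np.array([-y[1], y[0]])   # rotation by 90 degrees: orthogonal to y, same length
    return 0.5 * (1j * y + y_perp), 0.5 * (1j * y - y_perp)

def bilinear(a, b):
    """Bilinear (non-conjugating) product sum_j a_j * b_j, as used in the proof."""
    return np.sum(a * b)

y = np.array([3.0, 4.0])
zp, zm = calderon_vectors(y)
print(bilinear(zp, zp), bilinear(zm, zm), bilinear(zp, zm), zp + zm)
```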


6.4 The Factorization Method


In this section, we consider the full nonlinear problem but restrict ourselves to the more modest problem to determine only the shape of the region $D$, where $\gamma$ differs from the known background medium which we assume to be homogeneous with conductivity 1.


We sharpen the assumption on $\gamma$ of Assumption 6.1.


Assumption 6.10 In addition to Assumption 6.1, let there exist finitely many domains $D_j$, $j = 1,\dots,m$, such that $\overline{D_j} \subset B$ and $\overline{D_j} \cap \overline{D_k} = \emptyset$ for $j \ne k$ and such that the complement $B \setminus \overline{D}$ of the closure of the union $D = \bigcup_{j=1}^m D_j$ is connected. Every domain $D_j$ is assumed to satisfy the exterior cone condition (see, e.g., [103]); that is, for every $z \in \partial D_j$ there exists a set $C$ (part of a cone) of the form

$$C = z + \Bigl\{ x \in \mathbb{R}^2 : \hat\theta\cdot\frac{x}{|x|} > 1 - \delta,\ 0 < |x| < \varepsilon_0 \Bigr\}$$

for some $\varepsilon_0, \delta > 0$ and $\hat\theta \in \mathbb{R}^2$ with $|\hat\theta| = 1$, such that $C \cap D_j = \emptyset$.

Furthermore, there exists $q_0 > 0$ such that $\gamma = 1$ on $B \setminus D$ and $\gamma \ge 1 + q_0$ on $D$. We define the contrast $q \in L^\infty(B)$ by $q = \gamma - 1$ and note that $D$ is the support of $q$.


It is not difficult to show that every Lipschitz domain $D$ satisfies the exterior cone condition (see Problem 6.6).

The inverse problem of this section is to determine the shape of $D$ from the Neumann–Dirichlet operator $\Lambda$.


In the following, we use the relative data $\Lambda - \Lambda_1$ where $\Lambda_1 : L^2_\diamond(\partial B) \to L^2_\diamond(\partial B)$ corresponds to the known background medium; that is, to $\gamma = 1$. The information that $\Lambda - \Lambda_1$ does not vanish simply means that the background is perturbed by some contrast $q = \gamma - 1$. In the factorization method, we develop a criterion to decide whether or not a given point $z \in B$ belongs to $D$. The idea is then to take a fine grid in $B$ and to check this criterion for every grid point $z$. This provides a pixel-based picture of $D$.
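
We recall that $\Lambda f = u|_{\partial B}$ and $\Lambda_1 f = u_1|_{\partial B}$, where $u, u_1 \in H^1_\diamond(B)$ solve

$$\int_B (1 + q)\,\nabla u\cdot\nabla\psi\,dx = (f,\psi)_{L^2(\partial B)} \quad\text{for all } \psi \in H^1_\diamond(B), \tag{6.13}$$

$$\int_B \nabla u_1\cdot\nabla\psi\,dx = (f,\psi)_{L^2(\partial B)} \quad\text{for all } \psi \in H^1_\diamond(B). \tag{6.14}$$

For the difference, we have $(\Lambda_1 - \Lambda) f = (u_1 - u)|_{\partial B}$, and $u_1 - u \in H^1_\diamond(B)$ satisfies the variational equation

$$\int_B (1 + q)\,\nabla(u_1 - u)\cdot\nabla\psi\,dx = \int_D q\,\nabla u_1\cdot\nabla\psi\,dx \quad\text{for all } \psi \in H^1_\diamond(B). \tag{6.15}$$

It is the aim to factorize the operator $\Lambda_1 - \Lambda$ in the form

$$\Lambda_1 - \Lambda = A^* T A,$$

where the operators $A : L^2_\diamond(\partial B) \to L^2(D)^2$ and $T : L^2(D)^2 \to L^2(D)^2$ are defined as follows:⁵

• $Af = \nabla u_1|_D$, where $u_1 \in H^1_\diamond(B)$ solves the variational equation (6.14), and

• $Th = q\,(h - \nabla w)$ where $w \in H^1_\diamond(B)$ solves the variational equation

$$\int_B (1 + q)\,\nabla w\cdot\nabla\psi\,dx = \int_D q\,h\cdot\nabla\psi\,dx \quad\text{for all } \psi \in H^1_\diamond(B). \tag{6.16}$$

We note that the solution $w$ of (6.16) exists and is unique. This is seen by the representation theorem A.23 of Riesz because the right-hand side again defines a linear and bounded functional $F(\psi) = \int_D q\,h\cdot\nabla\psi\,dx$ on $H^1_\diamond(B)$. The left-hand side of (6.16) is again the inner product $(w,\psi)_*$. The classical interpretation of the variational equation (under the assumption that all functions are sufficiently smooth) can again be seen from Green's first theorem, applied in $D$ and in $B \setminus D$. Indeed, in this case (6.16) is equivalent to

$$\begin{aligned}
0 = {} & \int_D \psi\,\operatorname{div}\bigl[(1 + q)\nabla w - q\,h\bigr]\,dx - \int_{\partial D} \psi\,\nu\cdot\bigl[(1 + q)\nabla w - q\,h\bigr]\,d\ell \\
& + \int_{B \setminus D} \psi\,\Delta w\,dx - \int_{\partial(B \setminus D)} \psi\,\frac{\partial w}{\partial\nu}\,d\ell
\end{aligned}$$

for all $\psi$. This yields

$$\operatorname{div}\bigl[(1 + q)\nabla w - q\,h\bigr] = 0 \text{ in } D, \qquad \Delta w = 0 \text{ in } B \setminus \overline{D},$$

and

$$\nu\cdot\bigl[(1 + q)\nabla w - q\,h\bigr]\big|_- = \frac{\partial w}{\partial\nu}\Big|_+ \text{ on } \partial D, \qquad \frac{\partial w}{\partial\nu} = 0 \text{ on } \partial B;$$

that is,

$$\nu\cdot\bigl[(1 + q)\nabla w\bigr]\big|_- - \frac{\partial w}{\partial\nu}\Big|_+ = q\,\nu\cdot h \text{ on } \partial D, \qquad \frac{\partial w}{\partial\nu} = 0 \text{ on } \partial B.$$


⁵Here, $L^2(D)^2$ denotes the space of vector-valued functions $D \to \mathbb{R}^2$ such that both components are in $L^2(D)$.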

Theorem 6.11 Let the operators $A: L^2_\diamond(\partial B)\to L^2(D)^2$ and $T: L^2(D)^2\to L^2(D)^2$
be defined as above by (6.14) and (6.16), respectively. Then

$$\Lambda_1-\Lambda = A^*TA. \tag{6.17}$$

Proof: We define the auxiliary operator $H: L^2(D)^2\to L^2_\diamond(\partial B)$ by $Hh = w|_{\partial B}$
where $w\in H^1_\diamond(B)$ solves (6.16). Obviously, we conclude from (6.15) that
$\Lambda_1-\Lambda = HA$.

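We determine the adjoint $A^*: L^2(D)^2\to L^2_\diamond(\partial B)$ of $A$ and prove that
$A^*h = v|_{\partial B}$ where $v\in H^1_\diamond(B)$ solves the variational equation

$$\int_B \nabla v\cdot\nabla\psi\,dx = \int_D h\cdot\nabla\psi\,dx \quad\text{for all } \psi\in H^1_\diamond(B) \tag{6.18}$$

and even for all $\psi\in H^1(B)$ because it obviously holds for constants. The solution
$v$ exists and is unique by the same arguments as above. Again, by applying
Green's theorem we note that $v$ is the variational solution of the boundary value
problem

$$\Delta v = \begin{cases}\operatorname{div} h & \text{in } D,\\ 0 & \text{in } B\setminus D,\end{cases} \qquad \frac{\partial v}{\partial\nu} = 0 \text{ on } \partial B, \tag{6.19a}$$

$$v|_- = v|_+ \text{ on } \partial D, \qquad \frac{\partial v}{\partial\nu}\Big|_- - \frac{\partial v}{\partial\nu}\Big|_+ = \nu\cdot h \text{ on } \partial D. \tag{6.19b}$$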


To prove the representation of $A^*h$, we conclude from the definition of $A$,
equation (6.18) for $\psi = u_1$, and (6.14) that

$$(Af,h)_{L^2(D)^2} = \int_D \nabla u_1\cdot h\,dx = \int_B \nabla u_1\cdot\nabla v\,dx = (f,v)_{L^2(\partial B)},$$

and thus $v|_{\partial B}$ is indeed the value of the adjoint $A^*h$.

Now it remains to show that $H = A^*T$. Let $h\in L^2(D)^2$ and $w\in H^1_\diamond(B)$
solve (6.16). Then $Hh = w|_{\partial B}$. We rewrite (6.16) as

$$\int_B \nabla w\cdot\nabla\psi\,dx = \int_D q\,(h-\nabla w)\cdot\nabla\psi\,dx \quad\text{for all } \psi\in H^1_\diamond(B). \tag{6.20}$$

The comparison with (6.18) yields $A^*\bigl(q(h-\nabla w)\bigr) = w|_{\partial B} = Hh$; that is,
$A^*T = H$. Substituting this into $\Lambda_1-\Lambda = HA$ yields the assertion.


Properties of the operators $A$ and $T$ are listed in the following theorem:

Theorem 6.12 The operator $A: L^2_\diamond(\partial B)\to L^2(D)^2$ is compact, and the
operator $T: L^2(D)^2\to L^2(D)^2$ is self-adjoint and coercive:

$$(Th,h)_{L^2(D)^2} \ge c\,\|h\|^2_{L^2(D)^2} \quad\text{for all } h\in L^2(D)^2, \tag{6.21}$$

where $c = q_0\bigl(1 - q_0/(1+q_0)\bigr) > 0$.


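Proof: (i) For smooth functions $u_1\in C^2(\overline{B})$ with $\Delta u_1 = 0$ in $B$ and $\partial u_1/\partial\nu = f$
on $\partial B$ the following representation formula holds (see [53] or Theorem 7.16 for
the case of the three-dimensional Helmholtz equation):

$$u_1(x) = \int_{\partial B}\Bigl[\Phi(x,y)\,\frac{\partial u_1}{\partial\nu}(y) - u_1(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y) = \int_{\partial B}\Bigl[\Phi(x,y)\,f(y) - (\Lambda_1 f)(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y), \quad x\in B,$$

where $\Phi$ denotes the fundamental solution of the Laplace equation in $\mathbb{R}^2$; that
is,

$$\Phi(x,y) = -\frac{1}{2\pi}\ln|x-y|, \quad x\ne y.$$

We can write $\nabla u_1$ in $D$ in the form $\nabla u_1|_D = K_1 f - K_2\Lambda_1 f$ where the operators
$K_1, K_2: L^2_\diamond(\partial B)\to L^2(D)^2$, defined by

$$(K_1 f)(x) = \nabla\int_{\partial B}\Phi(x,y)\,f(y)\,ds(y), \quad x\in D,$$

$$(K_2 g)(x) = \nabla\int_{\partial B} g(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\,ds(y), \quad x\in D,$$

are compact as integral operators on bounded regions of integration with smooth
kernels. (Note that $\overline{D}\subset B$.) The representation $A = K_1 - K_2\Lambda_1$ holds by a
density argument (see Theorem A.30). Therefore, $A$ is also compact.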


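(ii) Let $h_1,h_2\in L^2(D)^2$ with corresponding solutions $w_1,w_2\in H^1_\diamond(B)$ of (6.16).
Then, with (6.16) for $h_2$, $w_2$ and $\psi = w_1$:

$$\begin{aligned}
(Th_1,h_2)_{L^2(D)^2} &= \int_D q\,(h_1-\nabla w_1)\cdot h_2\,dx \\
&= \int_D q\,h_1\cdot h_2\,dx - \int_D q\,\nabla w_1\cdot h_2\,dx \\
&= \int_D q\,h_1\cdot h_2\,dx - \int_B (1+q)\,\nabla w_2\cdot\nabla w_1\,dx.
\end{aligned}$$

This expression is symmetric with respect to $h_1$ and $h_2$. Therefore, $T$ is
self-adjoint.

For $h\in L^2(D)^2$ and corresponding solution $w\in H^1_\diamond(B)$ of (6.16), we
conclude that

$$\begin{aligned}
(Th,h)_{L^2(D)^2} &= \int_D q\,|h-\nabla w|^2\,dx + \int_D q\,(h-\nabla w)\cdot\nabla w\,dx \\
&= \int_D q\,|h-\nabla w|^2\,dx + \int_B |\nabla w|^2\,dx \quad\text{(with the help of (6.20))} \\
&\ge \int_D \bigl[q_0|h|^2 - 2q_0\,h\cdot\nabla w + (1+q_0)|\nabla w|^2\bigr]\,dx \\
&= \int_D \Bigl[\,\Bigl|\sqrt{1+q_0}\,\nabla w - \frac{q_0}{\sqrt{1+q_0}}\,h\Bigr|^2 + q_0\Bigl(1-\frac{q_0}{1+q_0}\Bigr)|h|^2\Bigr]\,dx \\
&\ge q_0\Bigl(1-\frac{q_0}{1+q_0}\Bigr)\|h\|^2_{L^2(D)^2}.
\end{aligned}$$

From this result and the factorization (6.17), we note that $\Lambda_1-\Lambda$ is compact,
self-adjoint (this follows already from Theorem 6.4), and nonnegative.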


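Now we derive the binary criterion on a point $z\in B$ to decide whether or
not this point belongs to $D$. First, for every point $z\in B$ we define a particular
function $G(\cdot,z)$ such that $\Delta G(\cdot,z) = 0$ in $B\setminus\{z\}$ and $\partial G(\cdot,z)/\partial\nu = 0$ on
$\partial B$ and such that $G(x,z)$ becomes singular as $x$ tends to $z$. We construct $G$ from
the Green's function $N$ for $\Delta$ in $B$ with respect to the Neumann boundary
conditions.

We make an ansatz for $N$ in the form $N(x,z) = \Phi(x,z) - \tilde{N}(x,z)$ where
again

$$\Phi(x,z) = -\frac{1}{2\pi}\ln|x-z|, \quad x\ne z,$$

is the fundamental solution of the Laplace equation in $\mathbb{R}^2$, and determine
$\tilde{N}(\cdot,z)\in H^1_\diamond(B)$ as the unique solution of the Neumann problem

$$\Delta\tilde{N}(\cdot,z) = 0 \text{ in } B \quad\text{and}\quad \frac{\partial\tilde{N}}{\partial\nu}(\cdot,z) = \frac{\partial\Phi}{\partial\nu}(\cdot,z) + \frac{1}{|\partial B|} \text{ on } \partial B.$$

We note that the solution exists because $\int_{\partial B}\bigl[\partial\Phi(\cdot,z)/\partial\nu + 1/|\partial B|\bigr]\,ds = 0$. This
is seen by Green's first theorem in the region $B\setminus B(z,\varepsilon)$:

$$\int_{\partial B}\frac{\partial\Phi}{\partial\nu}(x,z)\,ds(x) = \int_{|x-z|=\varepsilon}\frac{\partial\Phi}{\partial\nu}(x,z)\,ds(x) = -\frac{1}{2\pi}\int_{|x-z|=\varepsilon}\frac{x-z}{|x-z|^2}\cdot\frac{x-z}{|x-z|}\,ds(x) = -1.$$

Then $N = \Phi - \tilde{N}$ is the Green's function in $B$ with respect to the Neumann
boundary conditions; that is, $N$ satisfies

$$\Delta N(\cdot,z) = 0 \text{ in } B\setminus\{z\} \quad\text{and}\quad \frac{\partial N}{\partial\nu}(\cdot,z) = -\frac{1}{|\partial B|} \text{ on } \partial B.$$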
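The solvability condition used above can be checked numerically. The following is a minimal sketch, assuming that $B$ is the unit disk (so that $|\partial B| = 2\pi$ and the outward normal at $x$ is $x$ itself); the function name `boundary_flux` and the quadrature size are illustrative choices, not part of the text. It approximates $\int_{\partial B}\partial\Phi(x,z)/\partial\nu\,ds(x)$ by the periodic trapezoidal rule, and the result should be close to $-1$ for every source point $z$ inside the disk.

```python
import math


def boundary_flux(z, n_nodes=2000):
    """Approximate the integral of dPhi/dnu(x, z) over the unit circle |x| = 1.

    Phi(x, z) = -(1/(2*pi)) * ln|x - z| is the fundamental solution and nu the
    outward unit normal, which equals x on the unit circle.
    """
    zx, zy = z
    h = 2.0 * math.pi / n_nodes  # arc-length weight of the periodic trapezoidal rule
    total = 0.0
    for i in range(n_nodes):
        t = i * h
        x, y = math.cos(t), math.sin(t)
        dx, dy = x - zx, y - zy
        # grad_x Phi(x, z) = -(x - z) / (2*pi*|x - z|^2); take its normal component
        total += -(dx * x + dy * y) / (2.0 * math.pi * (dx * dx + dy * dy)) * h
    return total


if __name__ == "__main__":
    for z in [(0.0, 0.0), (0.3, -0.4), (-0.6, 0.2)]:
        flux = boundary_flux(z)
        # flux + 1 approximates the integral of dPhi/dnu + 1/|dB| over the boundary
        print(z, flux, flux + 1.0)
```

Since the integrand is smooth and periodic, the trapezoidal rule converges very quickly here as long as $z$ stays away from the boundary.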


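From the differentiable dependence of the solution $\tilde{N}(\cdot,z)$ on the parameter
$z\in B$, we conclude that, for any fixed $a\in\mathbb{R}^2$ with $|a| = 1$, the function
$G(\cdot,z) = a\cdot\nabla_z N(\cdot,z)$ satisfies

$$\Delta G(\cdot,z) = 0 \text{ in } B\setminus\{z\} \quad\text{and}\quad \frac{\partial G}{\partial\nu}(\cdot,z) = 0 \text{ on } \partial B. \tag{6.22}$$

The function $G(\cdot,z)$ has the following desired properties.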


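Lemma 6.13 Let $z\in B$, $\varepsilon_0 > 0$, $\theta\in[0,2\pi]$, and $\delta > 0$ be kept fixed. For
$\varepsilon\in[0,\varepsilon_0)$ define the set (part of a cone, compare with Assumption 6.10 for
the direction $(\cos\theta,\sin\theta)^\top$)

$$C_\varepsilon = z + \Bigl\{ r\begin{pmatrix}\cos t\\ \sin t\end{pmatrix} : \varepsilon < r < \varepsilon_0,\ |\theta - t| < \arccos(1-\delta)\Bigr\}$$

with vertex in $z$. Let $\varepsilon_0$ be so small that $C_0\subset B$. Then

$$\lim_{\varepsilon\to 0}\|G(\cdot,z)\|_{L^2(C_\varepsilon)} = \infty.$$

Proof: By the smoothness of $\tilde{N}(\cdot,z)$ it is sufficient to consider only the part
$a\cdot\nabla_z\ln|x-z|$. Using polar coordinates for $x$ with respect to $z$ (i.e., $x = z + r(\cos t,\sin t)^\top$)
and the representation of $a$ as $a = (\cos s,\sin s)^\top$, we have with $\eta = \arccos(1-\delta)$

$$\int_{C_\varepsilon}\bigl|a\cdot\nabla_z\ln|x-z|\bigr|^2\,dx = \int_{C_\varepsilon}\frac{[a\cdot(z-x)]^2}{|x-z|^4}\,dx = \int_\varepsilon^{\varepsilon_0}\int_{\theta-\eta}^{\theta+\eta}\frac{r^2\cos^2(s-t)}{r^4}\,r\,dt\,dr = \int_{\theta-\eta}^{\theta+\eta}\cos^2(s-t)\,dt\int_\varepsilon^{\varepsilon_0}\frac{dr}{r} = c\,\ln\frac{\varepsilon_0}{\varepsilon}$$

with $c = \int_{\theta-\eta}^{\theta+\eta}\cos^2(s-t)\,dt > 0$. Therefore,

$$\|G(\cdot,z)\|_{L^2(C_\varepsilon)} \ge \frac{1}{2\pi}\sqrt{c\,\ln\frac{\varepsilon_0}{\varepsilon}} - \|a\cdot\nabla_z\tilde{N}(\cdot,z)\|_{L^2(C_0)} \to \infty \quad\text{for } \varepsilon\to 0.$$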


We observe that the functions $\phi_z = G(\cdot,z)|_{\partial B}$ are traces of harmonic
functions in $B\setminus\{z\}$ with vanishing normal derivatives on $\partial B$. Comparing this
with the classical formulation (6.19a), (6.19b) of the adjoint $A^*$ of $A$, it seems
plausible that the "source region" $D$ of (6.19a), (6.19b) can be determined
by moving the source point $z$ in $\phi_z$. This is confirmed in the following theorem.

Theorem 6.14 Let Assumptions 6.10 hold and let $a\in\mathbb{R}^2$ with $|a| = 1$ be fixed.
For every $z\in B$ define $\phi_z\in L^2_\diamond(\partial B)$ by

$$\phi_z(x) = G(x,z) = a\cdot\nabla_z N(x,z), \quad x\in\partial B, \tag{6.23}$$

where $N$ denotes the Green's function with respect to the Neumann boundary
condition. Then

$$z\in D \iff \phi_z\in\mathcal{R}(A^*), \tag{6.24}$$

where $A^*: L^2(D)^2\to L^2_\diamond(\partial B)$ is the adjoint of $A$, given by (6.18), and $\mathcal{R}(A^*)$
its range.

Proof: First let $z\in D$. Choose a disc $B[z,\varepsilon] = \{x\in\mathbb{R}^2 : |x-z|\le\varepsilon\}$ with
center $z$ and radius $\varepsilon > 0$ such that $B[z,\varepsilon]\subset D$. Furthermore, choose a function
$\varphi\in C^\infty(\mathbb{R}^2)$ such that $\varphi(x) = 0$ for $|x-z|\le\varepsilon/2$ and $\varphi(x) = 1$ for $|x-z|\ge\varepsilon$
and set $w(x) = \varphi(x)\,G(x,z)$ for $x\in B$. Then $w\in H^1_\diamond(B)$ and $w = G(\cdot,z)$ in
$B\setminus D$, thus $w|_{\partial B} = \phi_z$.

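Next, we determine $u\in H^1_\diamond(D)$ as a solution of $\Delta u = \Delta w$ in $D$, $\partial u/\partial\nu = 0$
on $\partial D$; that is, in weak form

$$\int_D \nabla u\cdot\nabla\psi\,dx = \int_D \nabla w\cdot\nabla\psi\,dx - \int_{\partial D}\psi\,\frac{\partial}{\partial\nu}G(\cdot,z)\,ds, \quad \psi\in H^1_\diamond(D),$$

because $\partial w/\partial\nu = \partial G(\cdot,z)/\partial\nu$ on $\partial D$. Again, the solution exists and is unique.
Application of Green's first theorem in $B\setminus D$ yields

$$\int_{\partial D}\frac{\partial}{\partial\nu}G(\cdot,z)\,ds = \int_{\partial B}\frac{\partial}{\partial\nu}G(\cdot,z)\,ds = 0.$$

Therefore, the previous variational equation holds also for constants and thus
for all $\psi\in H^1(D)$. Now let $\psi\in H^1_\diamond(B)$ be a test function on $B$. Then

$$\begin{aligned}
\int_D \nabla u\cdot\nabla\psi\,dx &= \int_D \nabla w\cdot\nabla\psi\,dx - \int_{\partial D}\psi\,\frac{\partial}{\partial\nu}G(\cdot,z)\,ds \\
&= \int_D \nabla w\cdot\nabla\psi\,dx + \int_{B\setminus D}\nabla G(\cdot,z)\cdot\nabla\psi\,dx = \int_B \nabla w\cdot\nabla\psi\,dx.
\end{aligned}$$

Therefore, the definition $h = \nabla u$ in $D$ yields $A^*h = w|_{\partial B} = \phi_z$ and thus
$\phi_z\in\mathcal{R}(A^*)$.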

Now we prove the opposite direction. Let $z\notin D$. We have to show that $\phi_z$
is not contained in the range of $A^*$ and assume, on the contrary, that $\phi_z = A^*h$
for some $h\in L^2(D)^2$. Let $v\in H^1_\diamond(B)$ be the corresponding solution of (6.18).
Therefore, the function $w = v - G(\cdot,z)$ vanishes on $\partial B$ and solves the following
equations in the weak form

$$\Delta w = 0 \text{ in } B\setminus D_\varepsilon(z), \qquad \frac{\partial w}{\partial\nu} = 0 \text{ on } \partial B,$$

for every $\varepsilon > 0$ such that $D_\varepsilon(z) := D\cup B(z,\varepsilon)\subset B$; that is,

$$\int_{B\setminus D_\varepsilon(z)}\nabla w\cdot\nabla\psi\,dx = 0$$

for all $\psi\in H^1(B\setminus D_\varepsilon(z))$ with $\psi = 0$ on $\partial D_\varepsilon(z)$. We extend $w$ by zero into
the exterior of $B$. Then $w\in H^1(\mathbb{R}^2\setminus D_\varepsilon(z))$ because $w = 0$ on $\partial B$,⁶ and

$$\int_{\mathbb{R}^2\setminus D_\varepsilon(z)}\nabla w\cdot\nabla\psi\,dx = 0$$

for all $\psi\in H^1(\mathbb{R}^2\setminus D_\varepsilon(z))$ which vanish on $\partial D_\varepsilon(z)$. This is the variational form
of $\Delta w = 0$ in $\mathbb{R}^2\setminus D_\varepsilon(z)$. Since this holds for all sufficiently small $\varepsilon > 0$ we
conclude that $\Delta w = 0$ in the exterior $\Omega := \mathbb{R}^2\setminus(\overline{D}\cup\{z\})$ of $D\cup\{z\}$. Now we
use without proof⁷ that $w$ is analytic in this set and thus satisfies the unique
continuation principle, see, e.g., Theorem 4.39 of [161]. Therefore, because it
vanishes in the exterior of $B$ it has to vanish in all of the connected set $\Omega$. (Here
we make use of the assumption that $B\setminus D$ is connected.) Therefore, $v = G(\cdot,z)$
in $B\setminus(D\cup\{z\})$.

The point $z$ can either be on the boundary $\partial D$ or in the exterior of $D$.
In either case there is a cone $C_0$ of the form $C_0 = \{z + r(\cos t,\sin t)^\top : 0 < r < \varepsilon_0,\ |\theta - t| < \delta\}$
with $C_0\subset B\setminus D$. (Here we use the fact that every component
of $D$ satisfies the exterior cone condition.) We have $v|_{C_0}\in L^2(C_0)$ because even
$v\in H^1_\diamond(B)$. However, Lemma 6.13 yields that $\|G(\cdot,z)\|_{L^2(C_\varepsilon)}\to\infty$ for $\varepsilon\to 0$
where $C_\varepsilon = \{z + r(\cos t,\sin t)^\top : \varepsilon < r < \varepsilon_0,\ |\theta - t| < \delta\}$. This is a contradiction
because $v = G(\cdot,z)$ in $C_0$ and ends the proof.


Therefore, we have shown an explicit characterization of the unknown domain
$D$ by the range of the operator $A^*$. This operator, however, is also unknown:
only $\Lambda_1-\Lambda$ is known! The operators $A^*$ and $\Lambda_1-\Lambda$ are connected by the
factorization $\Lambda_1-\Lambda = A^*TA$. We can easily derive a second factorization of
$\Lambda_1-\Lambda$. The operator $\Lambda_1-\Lambda$ is self-adjoint and compact as an operator from
$L^2_\diamond(\partial B)$ into itself. Therefore, there exists a spectral decomposition of the form

$$(\Lambda_1-\Lambda)f = \sum_{j=1}^\infty \lambda_j\,(f,\psi_j)_{L^2(\partial B)}\,\psi_j,$$

where $\lambda_j\in\mathbb{R}$ denote the eigenvalues and $\psi_j\in L^2_\diamond(\partial B)$ the corresponding
orthonormal eigenfunctions of $\Lambda_1-\Lambda$ (see Theorem A.53 of Appendix A.6).
Furthermore, from the factorization and the coercivity of $T$ it follows that
$\bigl((\Lambda_1-\Lambda)f,f\bigr)_{L^2(\partial B)}\ge 0$ for all $f\in L^2_\diamond(\partial B)$. This implies $\lambda_j\ge 0$ for all $j$.
Therefore, we can define

$$Wf = \sum_{j=1}^\infty \sqrt{\lambda_j}\,(f,\psi_j)_{L^2(\partial B)}\,\psi_j,$$

and have a second factorization in the form $WW = \Lambda_1-\Lambda$. We write $(\Lambda_1-\Lambda)^{1/2}$
for $W$. The operator $(\Lambda_1-\Lambda)^{1/2}$ is also self-adjoint, and we have

$$(\Lambda_1-\Lambda)^{1/2}(\Lambda_1-\Lambda)^{1/2} = \Lambda_1-\Lambda = A^*TA. \tag{6.25}$$

We show that the ranges of $(\Lambda_1-\Lambda)^{1/2}$ and $A^*$ coincide.⁸ This follows directly
from the following functional analytic result:

⁶ It is not quite obvious that the extension is in $H^1(\mathbb{R}^2\setminus D_\varepsilon(z))$, see, e.g., Corollary 5.13
in [161].
⁷ See again [161], Theorem 4.38 and Corollary 3.4.


Lemma 6.15 Let $X$ and $Y$ be Hilbert spaces, $B: X\to X$, $A: X\to Y$, and
$T: Y\to Y$ linear and bounded such that $B = A^*TA$. Furthermore, let $T$ be
self-adjoint and coercive; that is, there exists $c > 0$ such that $(Ty,y)_Y\ge c\|y\|_Y^2$
for all $y\in Y$. Then, for any $\phi\in X$, $\phi\ne 0$,

$$\phi\in\mathcal{R}(A^*) \iff \inf\bigl\{(Bx,x)_X : x\in X,\ (x,\phi)_X = 1\bigr\} > 0.$$

Proof: (i) First, let $\phi = A^*y\in\mathcal{R}(A^*)$ for some $y\in Y$. Then $y\ne 0$, and we
estimate for arbitrary $x\in X$ with $(x,\phi)_X = 1$:

$$\begin{aligned}
(Bx,x)_X &= (TAx,Ax)_Y \ge c\,\|Ax\|_Y^2 = \frac{c}{\|y\|_Y^2}\,\|Ax\|_Y^2\,\|y\|_Y^2 \ge \frac{c}{\|y\|_Y^2}\,\bigl|(Ax,y)_Y\bigr|^2 \\
&= \frac{c}{\|y\|_Y^2}\,\bigl|(x,A^*y)_X\bigr|^2 = \frac{c}{\|y\|_Y^2}\,\bigl|(x,\phi)_X\bigr|^2 = \frac{c}{\|y\|_Y^2}.
\end{aligned}$$

Therefore, we have found a positive lower bound for the infimum.

(ii) Second, let $\phi\notin\mathcal{R}(A^*)$. Define the closed subspace

$$V = \{x\in X : (\phi,x)_X = 0\} = \{\phi\}^\perp.$$

We show that the image $A(V)$ is dense in the closure of the range of $A$. Indeed,
let $y\in\operatorname{closure}(\mathcal{R}(A))$ such that $y\perp Ax$ for all $x\in V$; that is, $0 = (Ax,y)_Y =
(x,A^*y)_X$ for all $x\in V$; that is, $A^*y\in V^\perp = \operatorname{span}\{\phi\}$. Because $\phi\notin\mathcal{R}(A^*)$ we
conclude that $A^*y = 0$. Therefore, $y\in\operatorname{closure}(\mathcal{R}(A))\cap\mathcal{N}(A^*)$. This yields
$y = 0$.⁹ Therefore, $A(V)$ is dense in $\operatorname{closure}(\mathcal{R}(A))$. Because $A\phi/\|\phi\|_X^2$ is in the
range of $A$ there exists a sequence $\tilde{x}_n\in V$ such that $A\tilde{x}_n\to -A\phi/\|\phi\|_X^2$. We
define $x_n := \tilde{x}_n + \phi/\|\phi\|_X^2$. Then $(x_n,\phi)_X = 1$ and $Ax_n\to 0$ for $n\to\infty$, and
we estimate

$$(Bx_n,x_n)_X = (TAx_n,Ax_n)_Y \le \|T\|_{\mathcal{L}(Y)}\,\|Ax_n\|_Y^2 \to 0, \quad n\to\infty,$$

and thus $\inf\{(Bx,x)_X : x\in X,\ (x,\phi)_X = 1\} = 0$.

⁸ This is also known as Douglas' Lemma, see [75].
⁹ Take a sequence $(x_j)$ in $X$ such that $Ax_j\to y$. Then $0 = (A^*y,x_j)_X = (y,Ax_j)_Y\to
(y,y)_Y$; that is, $y = 0$.


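We apply this result to both of the factorizations of (6.25). In both cases,
$B = \Lambda_1-\Lambda$ and $X = L^2_\diamond(\partial B)$. First, we set $Y = L^2(D)^2$ and $A: L^2_\diamond(\partial B)\to
L^2(D)^2$ and $T: L^2(D)^2\to L^2(D)^2$ as in the second factorization of (6.25).
Because $T$ is self-adjoint and coercive we conclude for any $\phi\in L^2_\diamond(\partial B)$, $\phi\ne 0$,
that

$$\phi\in\mathcal{R}(A^*) \iff \inf\bigl\{\bigl((\Lambda_1-\Lambda)f,f\bigr)_{L^2(\partial B)} : f\in L^2_\diamond(\partial B),\ (f,\phi)_{L^2(\partial B)} = 1\bigr\} > 0.$$

Second, we consider the first factorization of (6.25) with $T$ being the identity.
For $\phi\in L^2_\diamond(\partial B)$, $\phi\ne 0$, we conclude that

$$\phi\in\mathcal{R}\bigl((\Lambda_1-\Lambda)^{1/2}\bigr) \iff \inf\bigl\{\bigl((\Lambda_1-\Lambda)f,f\bigr)_{L^2(\partial B)} : (f,\phi)_{L^2(\partial B)} = 1\bigr\} > 0.$$

The right-hand sides of the characterizations depend only on $\Lambda_1-\Lambda$; therefore,
we conclude that

$$\mathcal{R}\bigl((\Lambda_1-\Lambda)^{1/2}\bigr) = \mathcal{R}(A^*). \tag{6.26}$$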
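As a side remark that is not part of the original argument, the infimum in these characterizations can be written out explicitly in terms of an eigensystem $\{\lambda_j,\psi_j\}$ of $\Lambda_1-\Lambda$. The sketch below assumes real-valued functions, that all $\lambda_j$ are strictly positive, and that the $\psi_j$ form a complete orthonormal system (both properties are established later in this section); the abbreviations $f_j$, $\phi_j$, $S_N$ are introduced only for this computation.

```latex
% Coefficients with respect to the eigenfunctions: f_j = (f,\psi_j), \phi_j = (\phi,\psi_j).
\begin{align*}
\bigl((\Lambda_1-\Lambda)f,f\bigr)_{L^2(\partial B)} &= \sum_{j=1}^\infty \lambda_j f_j^2,
\qquad (f,\phi)_{L^2(\partial B)} = \sum_{j=1}^\infty f_j\phi_j = 1. \\
\intertext{By the Cauchy--Schwarz inequality,}
1 = \sum_{j=1}^\infty \bigl(\sqrt{\lambda_j}\,f_j\bigr)\,\frac{\phi_j}{\sqrt{\lambda_j}}
 &\le \Bigl(\sum_{j=1}^\infty \lambda_j f_j^2\Bigr)^{1/2}
      \Bigl(\sum_{j=1}^\infty \frac{\phi_j^2}{\lambda_j}\Bigr)^{1/2}. \\
\intertext{The truncated choice $f_j^{(N)} = \phi_j/(\lambda_j S_N)$ for $j\le N$ (and $0$ otherwise), with
$S_N = \sum_{j\le N}\phi_j^2/\lambda_j > 0$, satisfies the constraint and gives}
\bigl((\Lambda_1-\Lambda)f^{(N)},f^{(N)}\bigr)_{L^2(\partial B)} &= \frac{1}{S_N}, \\
\intertext{so that}
\inf\bigl\{\bigl((\Lambda_1-\Lambda)f,f\bigr) : (f,\phi) = 1\bigr\}
 &= \Bigl[\sum_{j=1}^\infty \frac{\phi_j^2}{\lambda_j}\Bigr]^{-1},
\end{align*}
% with the convention that the right-hand side is zero if the series diverges.
```

Under these assumptions the infimum is positive exactly when the series $\sum_j \phi_j^2/\lambda_j$ converges, which is the form in which the criterion reappears in Theorem 6.18.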


Application of Theorem 6.14 yields the main result of the factorization method:

Theorem 6.16 Let Assumptions 6.10 be satisfied. For fixed $a\in\mathbb{R}^2$ with $a\ne 0$
and every $z\in B$ let $\phi_z\in L^2_\diamond(\partial B)$ be defined by (6.23); that is, $\phi_z(x) = a\cdot
\nabla_z N(x,z)$, $x\in\partial B$, where $N$ denotes the Green's function for $\Delta$ with respect
to the Neumann boundary conditions. Then

$$z\in D \iff \phi_z\in\mathcal{R}\bigl((\Lambda_1-\Lambda)^{1/2}\bigr). \tag{6.27}$$


We now rewrite the right-hand side with Picard's Theorem A.58 of
Appendix A.6. First, we show injectivity of the operator $\Lambda_1-\Lambda$.

Theorem 6.17 The operator $\Lambda_1-\Lambda$ is one-to-one.

Proof: From

$$\bigl((\Lambda_1-\Lambda)f,f\bigr)_{L^2(\partial B)} = (A^*TAf,f)_{L^2(\partial B)} = (TAf,Af)_{L^2(D)^2} \ge c\,\|Af\|^2_{L^2(D)^2} \quad\text{for } f\in L^2_\diamond(\partial B)$$

it suffices to prove injectivity of $A$. Let $Af = \nabla u_1|_D = 0$ where $u_1\in H^1_\diamond(B)$
denotes the weak solution of $\Delta u_1 = 0$ in $B$ and $\partial u_1/\partial\nu = f$ on $\partial B$. Therefore,
$u_1$ is constant in every component of $D$. Without proof, we use again the
regularity result that $u_1$ is analytic in $B$. The derivatives $v_j = \partial u_1/\partial x_j$ are
solutions of $\Delta v_j = 0$ in $B$ and $v_j = 0$ in $D$. The unique continuation property
yields $v_j = \partial u_1/\partial x_j = 0$ in all of $B$ and thus $f = 0$.


Therefore, the operator $\Lambda_1-\Lambda$ is self-adjoint, compact, one-to-one, and
all eigenvalues are positive. Let $\{\lambda_j,\psi_j\}$ be an eigensystem of $\Lambda_1-\Lambda$; that
is, $\lambda_j > 0$ are the eigenvalues of $\Lambda_1-\Lambda$ and $\psi_j\in L^2_\diamond(\partial B)$ are the
corresponding orthonormal eigenfunctions (see Theorem A.53 of Appendix A.6).
The set $\{\psi_j : j = 1,2,\ldots\}$ is complete by the spectral theorem. Therefore,
$\{\sqrt{\lambda_j},\psi_j,\psi_j\}$ is a singular system of the operator $(\Lambda_1-\Lambda)^{1/2}$. Application of
Picard's Theorem A.58 of Appendix A.6 yields

Theorem 6.18 Let Assumptions 6.10 be satisfied. For fixed $a\in\mathbb{R}^2$ with $a\ne 0$
and for every $z\in B$ let $\phi_z\in L^2_\diamond(\partial B)$ be defined by (6.23); that is, $\phi_z(x) =
a\cdot\nabla_z N(x,z)$, $x\in\partial B$. Then

$$z\in D \iff \sum_{j=1}^\infty \frac{(\phi_z,\psi_j)^2_{L^2(\partial B)}}{\lambda_j} < \infty \tag{6.28}$$

or, equivalently,

$$z\in D \iff \chi(z) := \Bigl[\sum_{j=1}^\infty \frac{(\phi_z,\psi_j)^2_{L^2(\partial B)}}{\lambda_j}\Bigr]^{-1} > 0. \tag{6.29}$$

Here we agreed on the setting that the inverse of the series is zero in the
case of divergence. Therefore, $\chi$ vanishes outside of $D$ and is positive in the
interior of $D$. The function

$$\operatorname{sign}\chi(z) = \begin{cases}1, & \chi(z) > 0,\\ 0, & \chi(z) = 0,\end{cases}$$

is thus the characteristic function of $D$.


We finish this section with some further remarks. 

We leave it to the reader to show (see Problems 6.2-6.4) that in the case
of $B = B(0,1)$ being the unit disk and $D = B(0,R)$ the disk of radius $R < 1$
the ratios $(\phi_z,\psi_j)^2_{L^2(\partial B)}/\lambda_j$ behave as $(|z|/R)^{2j}$. Therefore, convergence holds
if and only if $|z| < R$, which confirms the assertion of the last theorem.

In practice, only finitely many measurements are available; that is, the data
operator $\Lambda_1-\Lambda$ is replaced by a matrix $M\in\mathbb{R}^{m\times m}$. The question of convergence
of the series is obsolete because the sum consists of only finitely many
terms. However, in practice, it is observed that the value of this sum is much
smaller for points $z$ inside of $D$ than for points outside of $D$. Some authors (see
[123]) suggest testing the "convergence" by determining the slope of the straight
line that best fits the curve $j\mapsto\ln\bigl[(\phi_z,\psi_j)^2_{L^2(\partial B)}/\lambda_j\bigr]$ (for some $j$ only). The
points $z$ for which the slope is negative provide a good picture of $D$.
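The finite-dimensional test described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of [123]: the eigenpairs of the data matrix are assumed to be already computed and passed in as plain lists, the inner product is the Euclidean one, and the helper names (`picard_terms`, `indicator`, `log_slope`, `model_terms`) are hypothetical. The demonstration uses the model ratios $(|z|/R)^{2j}$ quoted above for concentric disks instead of measured data.

```python
import math


def picard_terms(eigenvalues, eigenvectors, phi):
    """Terms (phi, psi_j)^2 / lambda_j for a discrete eigensystem (lambda_j, psi_j)."""
    terms = []
    for lam, psi in zip(eigenvalues, eigenvectors):
        coeff = sum(p * q for p, q in zip(phi, psi))  # Euclidean inner product
        terms.append(coeff * coeff / lam)
    return terms


def indicator(eigenvalues, eigenvectors, phi):
    """Finite-sum analogue of chi(z) from (6.29): reciprocal of the Picard sum."""
    return 1.0 / sum(picard_terms(eigenvalues, eigenvectors, phi))


def log_slope(terms):
    """Least-squares slope of the points (j, ln terms[j-1]), j = 1, ..., len(terms)."""
    m = len(terms)
    js = [float(j) for j in range(1, m + 1)]
    ys = [math.log(t) for t in terms]  # requires strictly positive terms
    j_mean = sum(js) / m
    y_mean = sum(ys) / m
    num = sum((j - j_mean) * (y - y_mean) for j, y in zip(js, ys))
    den = sum((j - j_mean) ** 2 for j in js)
    return num / den


def model_terms(rho, radius, count):
    """Model ratios (rho/radius)^(2j), j = 1..count, for concentric disks."""
    return [(rho / radius) ** (2 * j) for j in range(1, count + 1)]


if __name__ == "__main__":
    radius = 0.5
    for rho in (0.3, 0.45, 0.55, 0.8):
        slope = log_slope(model_terms(rho, radius, 12))
        print(rho, slope, "inside" if slope < 0 else "outside")
```

For the model terms the fitted slope equals $2\ln(|z|/R)$, so its sign separates $|z| < R$ from $|z| > R$. With noisy data one would typically restrict the fit to those indices $j$ for which $\lambda_j$ lies above the noise level; that choice is left open here.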

A rigorous justification of a projection method to approximate the (infinite) 
series by a (finite) sum has been given in [177]. 

In the implementation of the factorization method, only the relative data
operator $\Lambda_1-\Lambda$ has to be known and no other information on $D$. For example,
it is allowed (see Assumption 6.10) that $D$ consists of several components.
Furthermore, the fact that the medium $D$ is penetrable is not used. If one imposes
some boundary condition on $\partial D$, the same characterization as in Theorem 6.18
holds. For example, in [123], the factorization method has been justified for an
insulating object $D$. In particular, the factorization method provides a proof of
uniqueness of $D$ independent of the nature of $D$; that is, whether it is finitely
conducting, a perfect conductor (Dirichlet boundary conditions on $\partial D$), a
perfect insulator (Neumann boundary conditions on $\partial D$), or carries a boundary
condition of Robin type.


6.5 Problems 


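6.1 Let $D$ be a domain with $\overline{D}\subset B$ and $\gamma\in L^\infty(B)$ piecewise constant
with $\gamma = \gamma_0$ in $D$ for some $\gamma_0\in\mathbb{R}$ and $\gamma = 1$ in $B\setminus D$. Let $u\in
H^1_\diamond(B)\cap C^2(B\setminus\partial D)$ be a solution of the variational equation (6.4) and
assume that $u|_D$ and $u|_{B\setminus\overline{D}}$ have differentiable extensions to $\overline{D}$ and $\overline{B}\setminus D$,
respectively.

Show that $u$ solves $\Delta u = 0$ in $B\setminus\partial D$ and $\partial u/\partial\nu = f$ on $\partial B$ and $u|_+ = u|_-$
on $\partial D$ and $\gamma_0\,\partial u|_-/\partial\nu = \partial u|_+/\partial\nu$ on $\partial D$.

Hint: Use Green's first theorem.

For the following problems let $B$ be the unit disk in $\mathbb{R}^2$ with center at the origin.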


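6.2 Show that the fundamental solution $\Phi$ and the Green's function $N$ are
given in polar coordinates ($x = r(\cos t,\sin t)^\top$ and $z = \rho(\cos\tau,\sin\tau)^\top$)
as

$$\Phi(x,z) = -\frac{1}{2\pi}\ln r + \frac{1}{2\pi}\sum_{n=1}^\infty \frac{1}{n}\Bigl(\frac{\rho}{r}\Bigr)^n\cos n(t-\tau),$$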


r 


N(a,z) = O@(a#,z) + o(— 


for $\rho = |z| < |x| = r$.


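Hint: Write $\Phi$ in the form

$$\Phi(x,z) = -\frac{1}{2\pi}\ln r - \frac{1}{4\pi}\ln\Bigl[1 + \Bigl(\frac{\rho}{r}\Bigr)^2 - 2\,\frac{\rho}{r}\cos(t-\tau)\Bigr]$$

and show $\sum_{n=1}^\infty \frac{1}{n}\alpha^n\cos(ns) = -\frac{1}{2}\ln[1+\alpha^2-2\alpha\cos s]$ by differentiation
with respect to $\alpha$ and applying the geometric series formula for
$\sum_{n=1}^\infty \alpha^{n-1}\exp(ins)$.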
6.3 Show that $\phi_z$ from (6.23) is given by

$$\phi_z(x) = \frac{a\cdot(x-z)}{\pi\,|x-z|^2}, \quad |x| = 1.$$

Also compute $\phi_z$ in polar coordinates for $a = (\cos\alpha,\sin\alpha)^\top$ by the
formulas of Problem 6.2.


6.4 Compute the eigenvalues $\lambda_n$ and the normalized eigenfunctions $\psi_n\in
L^2_\diamond(\partial B)$ of $\Lambda_1-\Lambda$, and the coefficients $(\phi_z,\psi_n)_{L^2(\partial B)}$ for the case of
Example 6.6. Compute the ratios $(\phi_z,\psi_n)^2_{L^2(\partial B)}/\lambda_n$ and validate the
condition (6.29) of Theorem 6.18.

6.5 Consider the case of $D\subset B$ being the annulus $D = \{x\in B : R_1 < |x| <
R_2\}$ for some $0 < R_1 < R_2 < 1$. Compute again the eigenvalues $\lambda_n$ and
the normalized eigenfunctions $\psi_n\in L^2_\diamond(\partial B)$ of $\Lambda_1-\Lambda$ and the coefficients
$(\phi_z,\psi_n)_{L^2(\partial B)}$. Verify that you can only determine the outer boundary
$\{x : |x| = R_2\}$ by the factorization method.

6.6 Let $f: [a,b]\to\mathbb{R}_{>0}$ be a Lipschitz continuous function; that is, $f(x) > 0$
for all $x\in[a,b]$ and there exists $L > 0$ with $|f(x)-f(y)|\le L|x-y|$ for all
$x,y\in[a,b]$. Define $D := \{(x_1,x_2)\in[a,b]\times\mathbb{R} : 0 < x_2 < f(x_1) \text{ for } x_1\in
[a,b]\}$. Show that this Lipschitz domain $D\subset\mathbb{R}^2$ satisfies the exterior cone
condition of Assumption 6.10.




Chapter 7 


An Inverse Scattering 
Problem 


7.1 Introduction 


We consider acoustic waves that travel in a medium such as a fluid. Let $v(x,t)$
be the velocity vector of a particle at $x\in\mathbb{R}^3$ and time $t$. Let $p(x,t)$, $\rho(x,t)$, and
$S(x,t)$ denote the pressure, density, and specific entropy, respectively, of the
fluid. We assume that no exterior forces act on the fluid. Then the movement
of the particle is described by the following equations.

$$\frac{\partial v}{\partial t} + (v\cdot\nabla)v + \gamma v + \frac{1}{\rho}\nabla p = 0 \quad\text{(Euler's equation)}, \tag{7.1a}$$

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho v) = 0 \quad\text{(continuity equation)}, \tag{7.1b}$$

$$f(\rho,S) = p \quad\text{(equation of state)}, \tag{7.1c}$$

$$\frac{\partial S}{\partial t} + v\cdot\nabla S = 0 \quad\text{(adiabatic hypothesis)}, \tag{7.1d}$$

where the function $f$ depends on the fluid and $\gamma$ is a damping coefficient, which
we assume to be piecewise constant. This system is nonlinear in the unknown
functions $v$, $\rho$, $p$, and $S$. Let the stationary case be described by $v_0 = 0$,
time-independent distributions $\rho = \rho_0(x)$ and $S = S_0(x)$, and constant $p_0$ such that
$p_0 = f(\rho_0(x),S_0(x))$. The linearization of this nonlinear system is given by
the (directional) derivative of this system at $(v_0,p_0,\rho_0,S_0)$. For deriving the
linearization, we set

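$$\begin{aligned}
v(x,t) &= \varepsilon\,v_1(x,t) + O(\varepsilon^2),\\
p(x,t) &= p_0 + \varepsilon\,p_1(x,t) + O(\varepsilon^2),\\
\rho(x,t) &= \rho_0(x) + \varepsilon\,\rho_1(x,t) + O(\varepsilon^2),\\
S(x,t) &= S_0(x) + \varepsilon\,S_1(x,t) + O(\varepsilon^2),
\end{aligned}$$

© Springer Nature Switzerland AG 2021

and we substitute this into (7.1a), (7.1b), (7.1c), and (7.1d). Ignoring terms
with $O(\varepsilon^2)$ leads to the linear system

$$\frac{\partial v_1}{\partial t} + \gamma v_1 + \frac{1}{\rho_0}\nabla p_1 = 0, \tag{7.2a}$$

$$\frac{\partial\rho_1}{\partial t} + \operatorname{div}(\rho_0 v_1) = 0, \tag{7.2b}$$

$$\frac{\partial f(\rho_0,S_0)}{\partial\rho}\,\rho_1 + \frac{\partial f(\rho_0,S_0)}{\partial S}\,S_1 = p_1, \tag{7.2c}$$

$$\frac{\partial S_1}{\partial t} + v_1\cdot\nabla S_0 = 0. \tag{7.2d}$$

First, we eliminate $S_1$. Because

$$0 = \nabla f(\rho_0(x),S_0(x)) = \frac{\partial f(\rho_0,S_0)}{\partial\rho}\,\nabla\rho_0 + \frac{\partial f(\rho_0,S_0)}{\partial S}\,\nabla S_0,$$

we conclude by differentiating (7.2c) with respect to $t$ and using (7.2d)

$$\frac{\partial p_1}{\partial t} = c(x)^2\Bigl[\frac{\partial\rho_1}{\partial t} + v_1\cdot\nabla\rho_0\Bigr], \tag{7.2e}$$

where the speed of sound $c$ is defined by

$$c(x)^2 := \frac{\partial}{\partial\rho}f(\rho_0(x),S_0(x)).$$

Now we eliminate $v_1$ and $\rho_1$ from the system. This can be achieved by
differentiating (7.2e) with respect to time and using equations (7.2a) and (7.2b). This
leads to the wave equation for $p_1$:

$$\frac{\partial^2 p_1(x,t)}{\partial t^2} + \gamma\,\frac{\partial p_1(x,t)}{\partial t} = c(x)^2\rho_0(x)\,\operatorname{div}\Bigl[\frac{1}{\rho_0(x)}\nabla p_1(x,t)\Bigr]. \tag{7.3}$$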
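The elimination step leading to (7.3) can be written out as follows. This is a short worked sketch rather than part of the original text; it assumes that $\gamma$ may be treated as constant in the region under consideration (it is assumed piecewise constant above), so that it commutes with the divergence.

```latex
\begin{align*}
\frac{\partial\rho_1}{\partial t}
  &= -\operatorname{div}(\rho_0 v_1)
   = -\rho_0\operatorname{div}v_1 - v_1\cdot\nabla\rho_0
  && \text{by (7.2b)},\\
\frac{\partial p_1}{\partial t}
  &= c^2\Bigl[\frac{\partial\rho_1}{\partial t} + v_1\cdot\nabla\rho_0\Bigr]
   = -c^2\rho_0\operatorname{div}v_1
  && \text{by (7.2e)},\\
\frac{\partial^2 p_1}{\partial t^2}
  &= -c^2\rho_0\operatorname{div}\frac{\partial v_1}{\partial t}
   = c^2\rho_0\operatorname{div}\Bigl[\gamma v_1 + \frac{1}{\rho_0}\nabla p_1\Bigr]
  && \text{by (7.2a)},\\
  &= -\gamma\,\frac{\partial p_1}{\partial t}
     + c^2\rho_0\operatorname{div}\Bigl[\frac{1}{\rho_0}\nabla p_1\Bigr]
  && \text{using } c^2\rho_0\operatorname{div}v_1 = -\frac{\partial p_1}{\partial t}.
\end{align*}
```

Moving the damping term to the left-hand side gives exactly the form (7.3).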


Now we assume that terms involving $\nabla\rho_0$ are negligible and that $p_1$ is
time-periodic; that is, of the form

$$p_1(x,t) = \operatorname{Re}\bigl[u(x)\,e^{-i\omega t}\bigr]$$

with frequency $\omega > 0$ and a complex-valued function $u = u(x)$ depending only
on the spatial variable. Substituting this into the wave equation (7.3) yields the
three-dimensional Helmholtz equation for $u$:

$$\Delta u(x) + \frac{\omega^2}{c(x)^2}\Bigl(1 + i\,\frac{\gamma}{\omega}\Bigr)u(x) = 0.$$

In free space, $c = c_0$ is constant and $\gamma = 0$. We define the wave number and the
index of refraction by

$$k := \frac{\omega}{c_0} > 0 \quad\text{and}\quad n(x) := \frac{c_0^2}{c(x)^2}\Bigl(1 + i\,\frac{\gamma}{\omega}\Bigr). \tag{7.4}$$

The Helmholtz equation then takes the form

$$\Delta u + k^2 n\,u = 0 \tag{7.5}$$

where $n$ is a complex-valued function with $\operatorname{Re}n(x)\ge 0$ and $\operatorname{Im}n(x)\ge 0$.

This equation holds in every source-free domain in $\mathbb{R}^3$. We assume in this
chapter that there exists $a > 0$ such that $c(x) = c_0$ and $\gamma(x) = 0$ for all $x$ with
$|x|\ge a$; that is, $n(x) = 1$ for $|x|\ge a$. This means that the inhomogeneous
medium $\{x\in\mathbb{R}^3 : n(x)\ne 1\}$ is bounded and contained in the ball $B(0,a) :=
\{y\in\mathbb{R}^3 : |y| < a\}$ of radius $a$. By $B[0,a] := \{y\in\mathbb{R}^3 : |y|\le a\}$, we denote its
closure. We further assume that the sources lie outside the ball $B[0,a]$.

These sources generate "incident" fields $u^i$ that satisfy the unperturbed
Helmholtz equation $\Delta u^i + k^2u^i = 0$ outside the sources. In this introduction,
we assume that $u^i$ is either a point source or a plane wave; that is, the
time-dependent incident fields have the form

$$p_1^i(x,t) = \frac{1}{|x-z|}\operatorname{Re}e^{ik|x-z|-i\omega t}, \quad\text{that is,}\quad u^i(x) = \frac{e^{ik|x-z|}}{|x-z|},$$

for a source at $z\in\mathbb{R}^3$, or

$$p_1^i(x,t) = \operatorname{Re}e^{ik\hat\theta\cdot x-i\omega t}, \quad\text{that is,}\quad u^i(x) = e^{ik\hat\theta\cdot x},$$

for a unit vector $\hat\theta\in\mathbb{R}^3$.

In any case, $u^i$ is a solution of the Helmholtz equation $\Delta u^i + k^2u^i = 0$ in
$\mathbb{R}^3\setminus\{z\}$ or $\mathbb{R}^3$, respectively. In the first case, $p_1^i$ describes a spherical wave that
travels away from the source with velocity $c_0$. In the second case, $p_1^i$ is a plane
wave that travels in the direction $\hat\theta$ with velocity $c_0$.
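That both incident fields solve the unperturbed Helmholtz equation away from the source can be checked numerically. The following is a minimal sketch using a seven-point central-difference Laplacian; the step size `h`, the sample points, and the function names are illustrative assumptions, and the residual is only expected to be small up to the $O(h^2)$ discretization error.

```python
import cmath
import math


def point_source(z, k):
    """u(x) = exp(ik|x - z|) / |x - z| for a source located at z in R^3."""
    def u(x):
        r = math.dist(x, z)
        return cmath.exp(1j * k * r) / r
    return u


def plane_wave(theta, k):
    """u(x) = exp(ik theta . x) for a unit direction vector theta in R^3."""
    def u(x):
        return cmath.exp(1j * k * sum(t * c for t, c in zip(theta, x)))
    return u


def helmholtz_residual(u, x, k, h=1e-3):
    """Central-difference approximation of Laplace(u)(x) + k^2 u(x)."""
    center = u(x)
    lap = -6.0 * center
    for axis in range(3):
        for sign in (-1.0, 1.0):
            p = list(x)
            p[axis] += sign * h  # shift one coordinate by +-h
            lap += u(tuple(p))
    return lap / (h * h) + k * k * center


if __name__ == "__main__":
    k = 2.0
    x = (0.7, -0.2, 0.5)
    print(abs(helmholtz_residual(point_source((0.0, 0.0, 0.0), k), x, k)))
    print(abs(helmholtz_residual(plane_wave((0.0, 0.6, 0.8), k), x, k)))
```

Evaluating the residual with a wave number different from the one used to build the field produces a value of order one, which gives a simple sanity check of the test itself.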

The incident field is disturbed by the medium described by the index of
refraction $n$ and produces a "scattered wave" $u^s$. The total field $u = u^i + u^s$
satisfies the Helmholtz equation $\Delta u + k^2nu = 0$ outside the sources; that is,
the scattered field $u^s$ satisfies the inhomogeneous equation

$$\Delta u^s + k^2n\,u^s = k^2(1-n)\,u^i \tag{7.6}$$

where the right-hand side is a function of compact support in $B(0,a)$. Furthermore,
we expect the scattered field $u^s$ to behave as a spherical wave far away
from the medium. This can be described by the following radiation condition:

$$\frac{\partial u^s(x)}{\partial r} - ik\,u^s(x) = O(1/r^2) \quad\text{as } r = |x|\to\infty, \tag{7.7}$$

uniformly in $x/|x|\in S^2$. Here we denote by $S^2$ the unit sphere in $\mathbb{R}^3$. The
smoothness of the solution $u^s$ depends on the smoothness of the refractive index
$n$. We refer to the beginning of Section 7.2 for more details. We have now
derived an (almost) complete description of the direct scattering problem.

Let the wave number $k > 0$, the index of refraction $n\in L^\infty(\mathbb{R}^3)$ with
$n(x) = 1$ for $|x|\ge a$, and the incident field $u^i$ be given. Determine the scattered
field $u^s$ that satisfies the source equation (7.6) in some generalized sense (to
be made more precise later) and the radiation condition (7.7).

In the inverse problem, one tries to determine the index of refraction $n$
from measurements of the field $u$ outside of $B(0,a)$ for several different incident
fields $u^i$ and/or different wave numbers $k$. The following example shows that
the radially symmetric case reduces to an ordinary differential equation.


Example 7.1
Let $n = n(r)$ be radially symmetric: $n$ is independent of the spherical
coordinates. Because in spherical polar coordinates $(r,\varphi,\theta)$,

$$\Delta = \frac{1}{r^2}\frac{\partial}{\partial r}\Bigl(r^2\frac{\partial}{\partial r}\Bigr) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\varphi^2} + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\Bigl(\sin\theta\,\frac{\partial}{\partial\theta}\Bigr),$$

the Helmholtz equation for radially symmetric $u = u(r)$ reduces to the following
ordinary differential equation of second order,

$$\frac{1}{r^2}\bigl(r^2u'(r)\bigr)' + k^2n(r)\,u(r) = 0;$$

that is,

$$u''(r) + \frac{2}{r}\,u'(r) + k^2n(r)\,u(r) = 0 \quad\text{for } r > 0. \tag{7.8a}$$

From the theory of linear ordinary differential equations of second order with
singular coefficients, we know that in a neighborhood of $r = 0$ there exist two
linearly independent solutions, a regular one and one with a singularity at $r = 0$.
We construct them by making the substitution $u(r) = v(r)/r$ in (7.8a). This
yields the equation

$$v''(r) + k^2n(r)\,v(r) = 0 \quad\text{for } r > 0. \tag{7.8b}$$

For the simplest case, where $n(r) = 1$, we readily see that $u_1(r) = \alpha\sin(kr)/r$
and $u_2(r) = \beta\cos(kr)/r$ are two linearly independent solutions; $u_1$ is regular
and $u_2$ is singular at the origin. Neither of them satisfies the radiation condition.
However, the combination $u(r) = \gamma\exp(ikr)/r$ does satisfy the radiation
condition because

$$u'(r) - ik\,u(r) = -\gamma\,\frac{\exp(ikr)}{r^2} = O\Bigl(\frac{1}{r^2}\Bigr)$$

as is readily seen. For the case of arbitrary $n$, we construct a fundamental system
$\{v_1,v_2\}$ of (7.8b) (compare with Section 5.2); that is, $v_1$ and $v_2$ satisfy (7.8b)
with $v_1(0) = 0$, $v_1'(0) = 1$, and $v_2(0) = 1$, $v_2'(0) = 0$. Then $u_1(r) = v_1(r)/r$ is
the regular and $u_2(r) = v_2(r)/r$ is the singular solution.
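The fundamental system $\{v_1,v_2\}$ of (7.8b) is easy to approximate numerically. The sketch below integrates $v'' + k^2n(r)v = 0$ with the classical fourth-order Runge-Kutta scheme; the function name `integrate`, the step count, and the restriction to a real-valued profile `n` given as a Python callable are assumptions made for illustration. For $n\equiv 1$ the result can be compared with the closed forms $v_1(r) = \sin(kr)/k$ and $v_2(r) = \cos(kr)$.

```python
import math


def integrate(n, k, r_end, steps, v0, dv0):
    """Integrate v'' + k^2 n(r) v = 0 from r = 0 to r_end by the classical RK4 scheme.

    Returns the pair (v(r_end), v'(r_end)) for initial values v(0) = v0, v'(0) = dv0.
    """
    h = r_end / steps
    r, v, w = 0.0, v0, dv0  # w stands for v'

    def rhs(r, v, w):
        # first-order system: v' = w, w' = -k^2 n(r) v
        return w, -k * k * n(r) * v

    for _ in range(steps):
        a1, b1 = rhs(r, v, w)
        a2, b2 = rhs(r + h / 2, v + h / 2 * a1, w + h / 2 * b1)
        a3, b3 = rhs(r + h / 2, v + h / 2 * a2, w + h / 2 * b2)
        a4, b4 = rhs(r + h, v + h * a3, w + h * b3)
        v += h / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        w += h / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
        r += h
    return v, w


if __name__ == "__main__":
    k, r_end = 3.0, 2.0
    v1, _ = integrate(lambda r: 1.0, k, r_end, 2000, 0.0, 1.0)
    v2, _ = integrate(lambda r: 1.0, k, r_end, 2000, 1.0, 0.0)
    print("u1(r_end) =", v1 / r_end, " exact:", math.sin(k * r_end) / (k * r_end))
    print("u2(r_end) =", v2 / r_end, " exact:", math.cos(k * r_end) / r_end)
```

Replacing the constant profile by any other bounded function `n` gives the regular and singular solutions $u_1 = v_1/r$ and $u_2 = v_2/r$ for that medium, up to the discretization error of the scheme.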

In the next section, we rigorously formulate the direct scattering problem
and prove the uniqueness and existence of a solution. The basic ingredients
for the uniqueness proof are a result by Rellich (see [222]) and a unique
continuation principle for solutions of the Helmholtz equation. We prove neither
Rellich's lemma nor the general continuation principle, but rather give a simple
proof for a special case of a unique continuation principle that is sufficient
for the uniqueness proof of the direct problem. This suggestion was made by
Hähner (see [116]). We then show the equivalence of the scattering problem
with an integral equation. Existence is then proven by an application of the
Riesz theorem A.36 of Appendix A. Section 7.3 is devoted to the introduction
of the far field patterns that describe the scattered fields "far away" from the
medium. We collect some results on the far field operator, several of which
are needed in Sections 7.5 and 7.7. The question of injectivity of the far field
operator is closely related to an unusual eigenvalue problem which we call the
interior transmission eigenvalue problem. We will investigate this eigenvalue
problem in Section 7.6. In Section 7.4, we prove uniqueness of the inverse problem.
Section 7.5 is devoted to the factorization method, which corresponds to
the method in Section 6.4 and provides a very simple characterization of the
support of the contrast by the far field patterns. This method is rigorously
justified under the assumption that, again, the wave number is not an interior
transmission eigenvalue. Since the interior transmission eigenvalue problem is
also an interesting problem in itself and has been widely studied during the past
decade, we include Section 7.6 on some aspects of this eigenvalue problem. Finally,
in Section 7.7, we present three classical numerical algorithms for solving the
inverse scattering problem.


7.2 The Direct Scattering Problem 


In this section, we collect properties of solutions to the Helmholtz equation that
are needed later. We prove uniqueness and existence of the direct scattering
problem and introduce the far field pattern. In the remaining part of this
chapter, we restrict ourselves to scattering problems for plane incident fields.

Throughout this chapter, we make the following assumptions. Let $n\in
L^\infty(\mathbb{R}^3)$ and $a > 0$ with $n(x) = 1$ for almost all $|x|\ge a$. Assume that
$\operatorname{Re}n(x)\ge 0$ and $\operatorname{Im}n(x)\ge 0$ for almost all $x\in\mathbb{R}^3$. Let $k\in\mathbb{R}$, $k > 0$,
and $\hat\theta\in\mathbb{R}^3$ with $|\hat\theta| = 1$. We set $u^i(x) := \exp(ik\hat\theta\cdot x)$ for $x\in\mathbb{R}^3$. Then $u^i$ solves
the Helmholtz equation

$$\Delta u^i + k^2u^i = 0 \quad\text{in } \mathbb{R}^3. \tag{7.9}$$

We again formulate the direct scattering problem. Given $n$, $k$, $\hat\theta$ satisfying the
previous assumptions, determine $u\in H^2_{loc}(\mathbb{R}^3)$ such that

$$\Delta u + k^2n\,u = 0 \quad\text{in } \mathbb{R}^3, \tag{7.10}$$

and $u^s := u - u^i$ satisfies the Sommerfeld radiation condition

$$\frac{\partial u^s}{\partial r} - ik\,u^s = O(1/r^2) \quad\text{for } r = |x|\to\infty, \tag{7.11}$$

uniformly in $x/|x|\in S^2$. Since the index function $n$ is not smooth we cannot
expect that the solution $u$ is smooth either. Rather, it belongs to the (local)
Sobolev space $H^2_{loc}(\mathbb{R}^3)$. We recall that for any open set $\Omega\subset\mathbb{R}^3$ and $p\in\mathbb{N}$ the
Sobolev space $H^p(\Omega)$ is defined as the completion of

$$\Bigl\{u\in C^p(\Omega) : \frac{\partial^{|j|_1}u}{\partial x_1^{j_1}\partial x_2^{j_2}\partial x_3^{j_3}}\in L^2(\Omega) \text{ for all } j\in\mathbb{N}_0^3 \text{ with } |j|_1\le p\Bigr\}$$

with respect to the norm

$$\|u\|_{H^p(\Omega)} = \Bigl[\sum_{|j|_1\le p}\Bigl\|\frac{\partial^{|j|_1}u}{\partial x_1^{j_1}\partial x_2^{j_2}\partial x_3^{j_3}}\Bigr\|^2_{L^2(\Omega)}\Bigr]^{1/2}.$$

Here we have set $|j|_1 = |j_1| + |j_2| + |j_3|$ for $j = (j_1,j_2,j_3)\in\mathbb{N}_0^3$. We refer to
Chapter 6 where we already used Sobolev spaces of functions on two-dimensional
domains $B$. The local spaces $H^p_{loc}(\Omega)$ are defined by

$$H^p_{loc}(\Omega) = \bigl\{u: \Omega\to\mathbb{C} : u|_B\in H^p(B) \text{ for every bounded domain } B \text{ with } \overline{B}\subset\Omega\bigr\}.$$

For us, the spaces $H^1(\Omega)$ and $H^2(\Omega)$ (and their local analogues) are particularly
important. We define the subspace $H^p_0(\Omega)$ of $H^p(\Omega)$ by the closure of the set
$C^p_0(\Omega) = \{\psi\in C^p(\Omega) : \psi \text{ has compact support in } \Omega\}$ in $H^p(\Omega)$. By definition
it is closed, and one can show that it is a strict subspace of $H^p(\Omega)$. The space
$H^1_0(\Omega)$ models the class of functions which vanish on the boundary of $\Omega$ while
for functions in $H^2_0(\Omega)$ also the normal derivatives $\partial\psi/\partial\nu$ vanish at $\partial\Omega$. The
following versions of Green's theorem are not difficult to prove by approximating
$u$ and $v$ by sequences of smooth functions.


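Lemma 7.2 Let $\Omega\subset\mathbb{R}^3$ be a bounded domain. Then

$$\int_\Omega \bigl[u\,\Delta v + \nabla u\cdot\nabla v\bigr]\,dx = 0 \quad\text{for all } u\in H^1_0(\Omega),\ v\in H^2(\Omega), \tag{7.12a}$$

$$\int_\Omega \bigl[u\,\Delta v - v\,\Delta u\bigr]\,dx = 0 \quad\text{for all } u\in H^2(\Omega),\ v\in H^2_0(\Omega). \tag{7.12b}$$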
2 


The application of (7.12a) to solutions of the Helmholtz equation yields the
following version.

Lemma 7.3 Let $v,w\in H^2(B(0,b))$ be solutions of the Helmholtz equation
$\Delta u + k^2u = 0$ in some annular region $A = \{x\in\mathbb{R}^3 : a < |x| < b\}$. Then $v$ and
$w$ are analytic in $A$, and for every $R\in(a,b)$ it holds that

$$\int_{|x|=R} v\,\frac{\partial w}{\partial\nu}\,ds = \int_{|x|<R}\bigl[\nabla v\cdot\nabla w + v\,\Delta w\bigr]\,dx. \tag{7.13}$$

In particular, the total field $u$ and thus the scattered field $u^s = u - u^i$ are
analytic for $|x| > a$, and the radiation condition (7.11) is well-defined.

Proof: The smoothness of $v$ and $w$ follows from general regularity results for
solutions of the Helmholtz equation $\Delta u + k^2u = 0$ and is not proven here,
see, e.g., [161], Corollary 3.4, or [55], Theorem 2.2. To show (7.13) we choose
$\rho\in C^\infty(\mathbb{R}^3)$ with compact support in $B(0,b)$ such that $\rho(x) = 1$ for $|x|\le R$.
Then $\rho v\in H^1_0(B(0,b))$, as is easily seen, and thus by (7.12a)

$$\begin{aligned}
0 &= \int_{|x|<b}\bigl[\rho v\,\Delta w + \nabla w\cdot\nabla(\rho v)\bigr]\,dx \\
&= \int_{|x|<R}\bigl[v\,\Delta w + \nabla w\cdot\nabla v\bigr]\,dx + \int_{R<|x|<b}\bigl[\rho v\,\Delta w + \nabla w\cdot\nabla(\rho v)\bigr]\,dx.
\end{aligned}$$

Because in the annular region $\{x\in\mathbb{R}^3 : R < |x| < b\}$ the functions $v$ and $w$
are smooth we apply the classical Green's first formula which yields (note that
$\rho$ vanishes for $|x| = b$ and is equal to one for $|x| = R$)

$$\int_{R<|x|<b}\bigl[\rho v\,\Delta w + \nabla w\cdot\nabla(\rho v)\bigr]\,dx = -\int_{|x|=R} v\,\frac{\partial w}{\partial\nu}\,ds.$$

This proves the assertion.


We will also need the following characterization of $H^2_0(\Omega)$, which is not
obvious at all and holds only for domains $\Omega$ with sufficiently regular
boundaries.


Lemma 7.4 Let $\Omega \subset \mathbb{R}^3$ be a bounded Lipschitz domain.¹ Then

$$H^2_0(\Omega) = \bigl\{u|_\Omega : u \in H^2(\mathbb{R}^3),\ u = 0 \text{ in } \mathbb{R}^3 \setminus \Omega\bigr\}.$$


We need some further results from the theory of the Helmholtz equation. We 
omit some of the proofs and refer to [161, 53, 55] for a detailed investigation of 
the direct scattering problems. The proof of uniqueness relies on the following 
very important theorem, which we state without proof. 


Lemma 7.5 (Rellich)
Let u satisfy the Helmholtz equation $\Delta u + k^2 u = 0$ for $|x| > a$. Assume,
furthermore, that

$$\lim_{R\to\infty} \int_{|x|=R} |u(x)|^2\,ds(x) = 0. \tag{7.14}$$

Then $u = 0$ for $|x| > a$.


¹ For a definition see, e.g., [191, 161].

For the proof, we refer to [161] (Lemma 3.21) or [55] (Lemma 2.12). In
particular, the condition (7.14) of this lemma is satisfied if $u(x)$ decays faster
than $1/|x|$. Note that the assertion of this lemma does not hold if the imaginary
part of k is positive or if k = 0.

The second important tool for proving uniqueness is the unique continuation 
principle. For the uniqueness proof, only a special case is sufficient. We present 
a simple proof by Hähner (see [116]), which is an application of the following
result on periodic differential equations with constant coefficients. This lemma
is also needed in the uniqueness proof for the inverse problem (see Section 7.4).
First, we define the cube $Q := (-\pi,\pi)^3 \subset \mathbb{R}^3$. Then every element
$g \in L^2(Q)$ can be expanded into a Fourier series in the form

$$g(x) = \sum_{j\in\mathbb{Z}^3} g_j\,e^{i j\cdot x}, \quad x \in \mathbb{R}^3, \tag{7.15a}$$

with Fourier coefficients

$$g_j = \frac{1}{(2\pi)^3} \int_Q g(y)\,e^{-i j\cdot y}\,dy, \quad j \in \mathbb{Z}^3. \tag{7.15b}$$

The convergence of the series is understood in the $L^2$-sense. (See Section A.2
of the Appendix.) Then Parseval's equation holds in the form

$$(2\pi)^3 \sum_{j\in\mathbb{Z}^3} |g_j|^2 = \int_Q |g(y)|^2\,dy. \tag{7.15c}$$

In particular, $L^2(Q)$ can be defined by those functions g such that
$\sum_{j\in\mathbb{Z}^3} |g_j|^2$ converges. Analogously, as in the one-dimensional
case of Section A.4 of the Appendix, for $p \in \mathbb{N}$ one defines the Sobolev
space $H^p_{\mathrm{per}}(Q)$ of periodic functions by

$$H^p_{\mathrm{per}}(Q) = \Bigl\{g \in L^2(Q) : \|g\|^2_{H^p_{\mathrm{per}}(Q)} = \sum_{j\in\mathbb{Z}^3} (1+|j|^2)^p\,|g_j|^2 < \infty\Bigr\}.$$

Here we have set $|j| = \sqrt{j\cdot j} = \sqrt{j_1^2 + j_2^2 + j_3^2}$ for
$j = (j_1, j_2, j_3) \in \mathbb{Z}^3$. Note that
$\|g\|_{L^2(Q)} = (2\pi)^{3/2}\,\|g\|_{H^0_{\mathrm{per}}(Q)}$. Then it is not difficult
to show (see Problem 7.1) that $H^p_{\mathrm{per}}(Q) \subset H^p(Q)$ and
$H^p_0(Q) \subset H^p_{\mathrm{per}}(Q)$. Furthermore, we identify $L^2(Q)$ and
$H^p_{\mathrm{per}}(Q)$ with the spaces of $2\pi$-periodic functions on $\mathbb{R}^3$
with respect to all variables: they satisfy $g(2\pi j + x) = g(x)$ for almost all
$x \in \mathbb{R}^3$ and $j \in \mathbb{Z}^3$.


Lemma 7.6 Let $p \in \mathbb{R}^3$, $\alpha \in \mathbb{R}$, and
$\hat e = (1, i, 0)^\top \in \mathbb{C}^3$. Then, for every $t > 0$ and every
$g \in L^2(Q)$, there exists a unique solution $w = w_t(g) \in H^2_{\mathrm{per}}(Q)$ of
the differential equation

$$\Delta w + (2t\hat e - i p)\cdot\nabla w - (i t + \alpha)\,w = g \quad\text{in } \mathbb{R}^3. \tag{7.16}$$

Furthermore, the following estimate holds:

$$\|w\|_{L^2(Q)} \le \frac{1}{t}\,\|g\|_{L^2(Q)} \quad\text{for all } g \in L^2(Q),\ t > 0. \tag{7.17}$$

In other words, there exists a linear and bounded solution operator

$$L_t : L^2(Q) \to L^2(Q), \quad g \mapsto w_t(g),$$

of (7.16) with the property $\|L_t\|_{\mathcal{L}(L^2(Q))} \le 1/t$ for all $t > 0$.


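Proof: We expand g into the Fourier series (7.15a) with Fourier coefficients
(7.15b). The representation $w(x) = \sum_{j\in\mathbb{Z}^3} w_j \exp(i j\cdot x)$ leads
to the equation

$$w_j\bigl[-|j|^2 + i j\cdot(2t\hat e - i p) - (i t + \alpha)\bigr] = g_j, \quad j \in \mathbb{Z}^3,$$

for the coefficients $w_j$. For fixed $t > 0$ we estimate

$$\bigl|-|j|^2 + i j\cdot(2t\hat e - i p) - (i t + \alpha)\bigr| \ge |\mathrm{Re}[\cdots]| = \bigl||j|^2 + 2t j_2 - j\cdot p + \alpha\bigr| \ge \tfrac12\,(1 + |j|^2)$$

for all $j \in \mathbb{Z}^3$ with $|j| \ge j_0$ for some $j_0 \in \mathbb{N}$. Furthermore,

$$\bigl|-|j|^2 + i j\cdot(2t\hat e - i p) - (i t + \alpha)\bigr| \ge |\mathrm{Im}[\cdots]| = t\,|2j_1 - 1| \ge t$$

for all $j \in \mathbb{Z}^3$ and $t > 0$. Therefore,

$$w_j = \frac{g_j}{-|j|^2 + i j\cdot(2t\hat e - i p) - (i t + \alpha)}$$

are well defined for all $j \in \mathbb{Z}^3$, and $w \in H^2_{\mathrm{per}}(Q)$ because

$$\sum_{|j|\ge j_0} (1 + |j|^2)^2\,|w_j|^2 \le 4 \sum_{|j|\ge j_0} |g_j|^2.$$

Furthermore, the solution operator

$$(L_t g)(x) := \sum_{j\in\mathbb{Z}^3} \frac{g_j}{-|j|^2 + i j\cdot(2t\hat e - i p) - (i t + \alpha)}\,e^{i j\cdot x}, \quad g \in L^2(Q),$$

is bounded from $L^2(Q)$ into itself with $\|L_t\|_{\mathcal{L}(L^2(Q))} \le 1/t$ for every $t > 0$.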
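
The lower bound by t that drives the estimate (7.17) can be checked numerically. The following Python sketch is only an illustration and not part of the proof; the function names and the sample values of t, p, and α are assumptions made here. It evaluates the Fourier symbol of (7.16) on a finite block of indices j and returns its smallest modulus, which should never drop below t because the imaginary part equals t(2j_1 - 1).

```python
import itertools

def multiplier(j, t, p, alpha):
    """Fourier symbol of (7.16) for the mode exp(i j.x):
    -|j|^2 + i j.(2 t e_hat - i p) - (i t + alpha), with e_hat = (1, i, 0)."""
    e_hat = (1.0, 1j, 0.0)
    j_sq = sum(c * c for c in j)
    dot = sum(jc * (2.0 * t * ec - 1j * pc) for jc, ec, pc in zip(j, e_hat, p))
    return -j_sq + 1j * dot - (1j * t + alpha)

def min_abs_multiplier(t, p, alpha, radius=6):
    """Smallest modulus of the symbol over all j with |j_m| <= radius."""
    rng = range(-radius, radius + 1)
    return min(abs(multiplier(j, t, p, alpha))
               for j in itertools.product(rng, repeat=3))

if __name__ == "__main__":
    for t in (0.1, 1.0, 7.5):
        print(t, min_abs_multiplier(t, (0.3, -1.2, 2.0), 0.25))
```

Since every coefficient is divided by a number of modulus at least t, Parseval's equation (7.15c) gives the operator bound 1/t directly.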


Now we can give a simple proof of the following version of a unique contin- 
uation principle. 


Theorem 7.7 Let $n \in L^\infty(\mathbb{R}^3)$ with $n(x) = 1$ for $|x| > a$ be given.
Let $u \in H^2(\mathbb{R}^3)$ be a solution of the Helmholtz equation
$\Delta u + k^2 n u = 0$ in $\mathbb{R}^3$ such that $u(x) = 0$ for all $|x| > b$ for
some $b > a$. Then u has to vanish in all of $\mathbb{R}^3$.


Proof: Define $\hat e = (1, i, 0)^\top \in \mathbb{C}^3$ as before, set
$\rho = 2b/\pi$, and define the function

$$w(x) = e^{\frac{i}{2}x_1 - t\,\hat e\cdot x}\,u(\rho x), \quad x \in Q := (-\pi,\pi)^3,$$

for some $t > 0$. Then $w(x) = 0$ for all $|x| > \pi/2$, in particular near the
boundary of the cube Q. Extend w to a $2\pi$-periodic function in $\mathbb{R}^3$ by
$w(2\pi j + x) := w(x)$ for $x \in Q$ and all $j \in \mathbb{Z}^3$, $j \ne 0$. Then
$w \in H^2_{\mathrm{per}}(Q)$, and w satisfies the differential equation

$$\Delta w + (2t\hat e - i p)\cdot\nabla w - (i t + 1/4)\,w = -\rho^2 k^2\,\tilde n\,w.$$

Here, we have set $p = (1,0,0)^\top$ and $\tilde n(2\pi j + x) := n(\rho x)$ for almost
all $x \in [-\pi,\pi]^3$ and $j \in \mathbb{Z}^3$. Application of the previous lemma
to this differential equation yields the existence of a linear bounded operator
$L_t$ from $L^2(Q)$ into itself with $\|L_t\|_{\mathcal{L}(L^2(Q))} \le 1/t$ such that
the differential equation is equivalent to

$$w = -\rho^2 k^2\,L_t(\tilde n\,w).$$

Estimating

$$\|w\|_{L^2(Q)} \le \frac{\rho^2 k^2}{t}\,\|\tilde n\,w\|_{L^2(Q)} \le \frac{\rho^2 k^2\,\|n\|_\infty}{t}\,\|w\|_{L^2(Q)}$$

yields $w = 0$ for sufficiently large $t > 0$. Thus, also u has to vanish.


The preceding theorem is a special case of a far more general unique contin- 
uation principle, which we formulate without proof here. 

Let $u \in H^2_{\mathrm{loc}}(\Omega)$ be a solution of the Helmholtz equation
$\Delta u + k^2 n u = 0$ in a domain $\Omega \subset \mathbb{R}^3$ (i.e., $\Omega$ is
open and connected). Furthermore, let $n \in L^\infty(\Omega)$ and $u(x) = 0$ on some
open set. Then $u = 0$ in all of $\Omega$.

For a proof we refer to, for example, [55]. 

Now we can prove the following uniqueness result. 


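Theorem 7.8 (Uniqueness)
The problem (7.10), (7.11) has at most one solution; that is, if u is a solution
corresponding to $u^i = 0$, then $u = 0$.


Proof: Let $u^i = 0$. The radiation condition (7.11) yields

$$O(1/R^2) = \int_{|x|=R} \Bigl|\frac{\partial u}{\partial r} - i k u\Bigr|^2 ds
= \int_{|x|=R} \Bigl(\Bigl|\frac{\partial u}{\partial r}\Bigr|^2 + k^2 |u|^2\Bigr)\,ds + 2k\,\mathrm{Im} \int_{|x|=R} u\,\frac{\partial \overline{u}}{\partial r}\,ds. \tag{7.18}$$

We transform the last integral using Green's formula (7.13) for $v = u$ and
$w = \overline{u}$; that is,

$$\int_{|x|=R} u\,\frac{\partial \overline{u}}{\partial r}\,ds = \int_{|x|<R} \bigl[|\nabla u|^2 - k^2\,\overline{n}\,|u|^2\bigr]\,dx$$

and thus

$$\mathrm{Im} \int_{|x|=R} u\,\frac{\partial \overline{u}}{\partial r}\,ds = k^2 \int_{|x|<R} \mathrm{Im}\,n\,|u|^2\,dx \ge 0.$$

We substitute this into (7.18) and let R tend to infinity. This yields

$$0 \le \limsup_{R\to\infty} \int_{|x|=R} \Bigl(\Bigl|\frac{\partial u}{\partial r}\Bigr|^2 + k^2 |u|^2\Bigr)\,ds \le 0,$$

and thus

$$\int_{|x|=R} |u|^2\,ds \longrightarrow 0 \quad\text{as } R \to \infty.$$

Rellich's Lemma 7.5 implies $u = 0$ for $|x| > a$. Finally, the unique continuation
principle of Theorem 7.7 yields $u = 0$ in $\mathbb{R}^3$.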


Now let

$$\Phi(x,y) := \frac{e^{i k|x-y|}}{4\pi\,|x-y|} \quad\text{for } x, y \in \mathbb{R}^3,\ x \ne y, \tag{7.19}$$

be the fundamental solution or free space Green's function of the Helmholtz
equation. Properties of the fundamental solution are summarized in the following
theorem.


Theorem 7.9 $\Phi(\cdot,y)$ solves the Helmholtz equation $\Delta u + k^2 u = 0$ in
$\mathbb{R}^3 \setminus \{y\}$ for every $y \in \mathbb{R}^3$. It satisfies the
radiation condition

$$\frac{x}{|x|}\cdot\nabla_x \Phi(x,y) - i k\,\Phi(x,y) = O(1/|x|^2)$$

uniformly in $x/|x| \in S^2$ and $y \in Y$ for every bounded subset
$Y \subset \mathbb{R}^3$. In addition,

$$\Phi(x,y) = \frac{e^{i k|x|}}{4\pi\,|x|}\,e^{-i k\,\hat x\cdot y} + O(1/|x|^2) \tag{7.20}$$

uniformly in $\hat x = x/|x| \in S^2$ and $y \in Y$.


The proof is not difficult and is left to the reader. 
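
Although the proof is left to the reader, the asymptotic form (7.20) is easy to test numerically. The sketch below is an illustration only; the helper names and the sample values of k, y, and the observation direction are assumptions made here. It compares Φ(x, y) with its far field approximation along one ray and reports the error, which should shrink by roughly a factor of four whenever |x| is doubled.

```python
import cmath
import math

def phi(x, y, k):
    """Fundamental solution (7.19) of the Helmholtz equation in R^3."""
    r = math.dist(x, y)
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r)

def phi_far(x, y, k):
    """Leading term of (7.20): exp(ik|x|)/(4 pi |x|) * exp(-ik x_hat.y)."""
    r = math.hypot(*x)
    x_hat_dot_y = sum(xc * yc for xc, yc in zip(x, y)) / r
    return cmath.exp(1j * k * r) / (4.0 * math.pi * r) * cmath.exp(-1j * k * x_hat_dot_y)

def far_field_error(radius, k=2.0, y=(0.3, -0.2, 0.5), direction=(1.0, 2.0, 2.0)):
    """|Phi - leading term| at the point x = radius * direction / |direction|."""
    norm = math.hypot(*direction)
    x = [radius * c / norm for c in direction]
    return abs(phi(x, y, k) - phi_far(x, y, k))

if __name__ == "__main__":
    for radius in (50.0, 100.0, 200.0):
        err = far_field_error(radius)
        print(radius, err, err * radius**2)  # last column should stay bounded
```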


Before we turn to the question of existence we prove a general regularity 
result. 


Lemma 7.10 Let $D \subset \mathbb{R}^3$ be some bounded domain,
$n \in L^\infty(D)$, and $f \in L^2(D)$.

(a) Let $w \in L^2(D)$ be an ultra-weak solution of $\Delta w + k^2 n w = f$ in D;
that is, let w satisfy

$$\int_D w\,(\Delta\psi + k^2 n\,\psi)\,dx = \int_D f\,\psi\,dx \tag{7.21}$$

for all $\psi \in C^2(D)$ with compact support in D. Then
$w \in H^2_{\mathrm{loc}}(D)$; that is, $w \in H^2(A)$ for all domains A with
$\overline{A} \subset D$. Furthermore, $\Delta w + k^2 n w = f$ almost everywhere in D,
and for every domain A with $\overline{A} \subset D$ there exists $c > 0$ (depending
only on n, D, and A) such that
$\|w\|_{H^2(A)} \le c\,\bigl[\|f\|_{L^2(D)} + \|w\|_{L^2(D)}\bigr]$.


(b) Every solution $w \in H^2(D)$ of $\Delta w + k^2 n w = f$ in D is also an
ultra-weak solution.


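Proof: (a) Let A be any open bounded set such that $\overline{A} \subset D$. We
choose $\rho \in C^\infty(D)$ with compact support in D such that $\rho = 1$ on A.
Furthermore, let Q be a cube containing D in its interior. For any
$\phi \in C^\infty(\mathbb{R}^3)$ we take $\psi = \rho\phi$ in (7.21). With
$\Delta\psi = \rho\,\Delta\phi + 2\nabla\rho\cdot\nabla\phi + \phi\,\Delta\rho$ we have

$$\int_D \rho\,w\,[\phi - \Delta\phi]\,dx = \int_D \bigl\{w\,[(k^2 n + 1)\rho\phi + 2\nabla\rho\cdot\nabla\phi + \phi\,\Delta\rho] - f\rho\phi\bigr\}\,dx = \int_D [g\,\phi + h\cdot\nabla\phi]\,dx \tag{7.22}$$

with $g = w\,[(k^2 n + 1)\rho + \Delta\rho] - f\rho$ and $h = 2w\,\nabla\rho$. We can
replace the region of integration by Q because $\rho$ vanishes in $Q \setminus D$.
Without loss of generality we assume that $Q = (-\pi,\pi)^3$. Since
$g, h \in L^2(Q)$ we expand them into Fourier series
$g(x) = \sum_{j\in\mathbb{Z}^3} g_j e^{i j\cdot x}$ and
$h(x) = \sum_{j\in\mathbb{Z}^3} h_j e^{i j\cdot x}$ with $g_j \in \mathbb{C}$ and
$h_j \in \mathbb{C}^3$ such that $\sum_{j\in\mathbb{Z}^3} |g_j|^2 < \infty$ and
$\sum_{j\in\mathbb{Z}^3} |h_j|^2 < \infty$, and make the ansatz
$(\rho w)(x) = \sum_{j\in\mathbb{Z}^3} w_j e^{i j\cdot x}$ in Q. We take
$\phi(x) = e^{-i\ell\cdot x}$ for some $\ell \in \mathbb{Z}^3$ in (7.22) and have
$(1 + |\ell|^2)\,w_\ell = g_\ell - i\ell\cdot h_\ell$ and thus

$$(1 + |\ell|^2)\,|w_\ell|^2 \le \frac{2}{1 + |\ell|^2}\,\bigl[|g_\ell|^2 + |\ell|^2 |h_\ell|^2\bigr] \le 2\,\bigl[|g_\ell|^2 + |h_\ell|^2\bigr]. \tag{7.23}$$

Therefore, $\rho w \in H^1_{\mathrm{per}}(Q)$ and

$$\|w\|_{H^1(A)} \le c_1\,\|\rho w\|_{H^1_{\mathrm{per}}(Q)} \le c_2\,\bigl[\|g\|_{L^2(Q)} + \|h\|_{L^2(Q)}\bigr] \le c_3\,\bigl[\|w\|_{L^2(D)} + \|f\|_{L^2(D)}\bigr].$$

Since A was arbitrary we have shown that $w \in H^1_{\mathrm{loc}}(D)$. Now we repeat
the first part of the proof but apply Green's first formula to the second term of
the right-hand side of (7.22). This yields

$$\int_Q \rho\,w\,[\phi - \Delta\phi]\,dx = \int_Q \bigl[w\,(k^2 n + 1)\rho - 2\,\mathrm{div}(w\nabla\rho) + w\,\Delta\rho - f\rho\bigr]\,\phi\,dx.$$

Now we argue in the same way but with $h = 0$. Estimate (7.23) yields
$(1 + |\ell|^2)^2\,|w_\ell|^2 \le 2\,|g_\ell|^2$ and thus
$\rho w \in H^2_{\mathrm{per}}(Q)$ and

$$\|w\|_{H^2(A)} \le c_1\,\|\rho w\|_{H^2_{\mathrm{per}}(Q)} \le c_2\,\|g\|_{L^2(Q)} \le c_3\,\bigl[\|w\|_{L^2(D)} + \|\nabla w\|_{L^2(B)} + \|f\|_{L^2(D)}\bigr]$$

where B is the support of $\rho$. Now we substitute the estimate for
$\|w\|_{H^1(B)}$, which yields the desired estimate.
(b) This follows directly from Green's theorem in the form (7.12b).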


Now we construct volume potentials with the fundamental solution (7.19). 


Theorem 7.11 Let $\Omega \subset \mathbb{R}^3$ be a bounded domain. For every
$\phi \in L^2(\Omega)$ the volume potential

$$v(x) := \int_\Omega \phi(y)\,\Phi(x,y)\,dy, \quad x \in \mathbb{R}^3, \tag{7.24}$$

yields a function $v \in H^2_{\mathrm{loc}}(\mathbb{R}^3)$ that satisfies the radiation
condition (7.11) and is the only radiating² solution of
$\Delta v + k^2 v = -\phi$.

Furthermore, for every ball $B = B(0,R)$ containing $\Omega$ in its interior there
exists $c > 0$ (only dependent on B, k, and $\Omega$) such that

$$\|v\|_{H^2(B)} \le c\,\|\phi\|_{L^2(\Omega)}. \tag{7.25}$$

Proof: First we state without proof (see, e.g., [161], Theorem 3.9) that for
any $k \in \mathbb{C}$ and $\phi \in C^1_0(\Omega)$; that is, $\phi \in C^1(\Omega)$
with compact support in $\Omega$, the potential v is in $C^2(\mathbb{R}^3)$ and solves
$\Delta v + k^2 v = -\phi$ in $\Omega$ and $\Delta v + k^2 v = 0$ in the exterior of
$\Omega$.

Second, we fix $\phi \in L^2(\Omega)$ and choose a sequence
$\phi_j \in C^1_0(\Omega)$ which converges to $\phi$ in $L^2(\Omega)$. Let v and
$v_j$ be the corresponding potentials, and let $\psi \in C^\infty(\mathbb{R}^3)$ be
some test function with compact support. Then $\Delta v_j + k^2 v_j = -\phi_j$ in
$\mathbb{R}^3$ and thus by Green's second theorem

$$\int_{\mathbb{R}^3} v_j\,(\Delta\psi + k^2\psi)\,dx = -\int_\Omega \phi_j\,\psi\,dx.$$

Let $B(0,R)$ be a ball that contains the support of $\psi$. From the boundedness of
the volume integral operator from $L^2(\Omega)$ into $L^2(B(0,R))$ we conclude that
$v_j$ converges to v in $L^2(B(0,R))$ as j tends to infinity. Therefore,

$$\int_{\mathbb{R}^3} v\,(\Delta\psi + k^2\psi)\,dx = -\int_\Omega \phi\,\psi\,dx$$

for all $\psi \in C^\infty(\mathbb{R}^3)$ with compact support. Therefore, v is an
ultra-weak solution of $\Delta v + k^2 v = -\phi$ in $\mathbb{R}^3$. The regularity
result of Lemma 7.10 applied to $D = B(0,R+1)$ yields
$v \in H^2_{\mathrm{loc}}(B(0,R+1))$ and the estimate

$$\|v\|_{H^2(B(0,R))} \le c\,\bigl[\|v\|_{L^2(B(0,R+1))} + \|\phi\|_{L^2(\Omega)}\bigr] \le c'\,\|\phi\|_{L^2(\Omega)}$$

where we used again the boundedness of the volume potential from $L^2(\Omega)$ to
$L^2(B(0,R+1))$.


² That is, it satisfies the radiation condition (7.11).


Now we can transform the scattering problem into a Fredholm integral equa- 
tion of the second kind. The following theorem is needed quite often later on. 


Theorem 7.12 (a) Let $u \in H^2_{\mathrm{loc}}(\mathbb{R}^3)$ be a solution of the
scattering problem (7.10), (7.11). Then $u|_{B(0,a)}$ belongs to $L^2(B(0,a))$ and
solves the Lippmann-Schwinger integral equation

$$u(x) = u^i(x) - k^2 \int_{|y|<a} (1 - n(y))\,\Phi(x,y)\,u(y)\,dy, \quad x \in B(0,a). \tag{7.26}$$

(b) If, on the other hand, $u \in L^2(B(0,a))$ is a solution of the integral equation
(7.26), then u can be extended by the right-hand side of (7.26) to a solution
$u \in H^2_{\mathrm{loc}}(\mathbb{R}^3)$ of the scattering problem (7.10), (7.11).


Proof: (a) Let u be a solution of (7.10), (7.11) and v the volume potential
with density $k^2(1-n)u \in L^2(B(0,a))$. By Theorem 7.11 we conclude that
$v \in H^2_{\mathrm{loc}}(\mathbb{R}^3)$ and $\Delta v + k^2 v = k^2(n-1)u$. From
$\Delta u + k^2 u = k^2(1-n)u$ and $\Delta u^i + k^2 u^i = 0$ we conclude that
$\Delta(v + u^s) + k^2(v + u^s) = 0$. Furthermore, v and $u^s$ both satisfy the
radiation condition (7.11). The uniqueness Theorem 7.8 yields that $v + u^s = 0$,
thus $u = u^i + u^s = u^i - v$. This proves the first part.
(b) Let $u \in L^2(B(0,a))$ be a solution of (7.26). Again define v as the volume
potential with density $k^2(1-n)u \in L^2(B(0,a))$. Then $u = u^i - v$ in $B(0,a)$.
Extend u by the right-hand side of this formula to all of $\mathbb{R}^3$. Again,
by Theorem 7.11, we conclude that $v \in H^2_{\mathrm{loc}}(\mathbb{R}^3)$ and
$\Delta v + k^2 v = k^2(n-1)u$. Therefore, also
$u \in H^2_{\mathrm{loc}}(\mathbb{R}^3)$ and
$\Delta u + k^2 u = -(\Delta v + k^2 v) = k^2(1-n)u$; that is,
$\Delta u + k^2 n u = 0$. Therefore, $u^s = -v$, which ends the proof.


As a corollary, we derive the following result on existence. 


Theorem 7.13 Under the given assumptions on k, n, and $\hat\theta$, there exists a
unique solution u of the scattering problem (7.10), (7.11) or, equivalently, the
integral equation (7.26).


Proof: We apply the Riesz theory (Theorem A.36 of Appendix A) to the
integral equation $u = u^i - Tu$, where the operator T from $L^2(B(0,a))$ into
itself is defined by

$$(Tu)(x) := k^2 \int_{|y|<a} (1 - n(y))\,\Phi(x,y)\,u(y)\,dy, \quad |x| < a. \tag{7.27}$$

This integral operator is compact. There are several ways to prove this. The
simplest is perhaps the observation that this integral operator is bounded from
$L^2(B(0,a))$ into the Sobolev space $H^2(B(0,a))$ by Theorem 7.11. Furthermore,
by Rellich's embedding theorem (see [1, 191]) the Sobolev space $H^2(B(0,a))$ is
compactly embedded in $L^2(B(0,a))$. One can also argue directly by observing
that the kernel $\Phi(x,y)$ of this integral operator is weakly singular (see
Theorem A.35 of Appendix A for the one-dimensional case). Therefore, it is
sufficient to prove uniqueness of a solution to (7.26). This follows by
Theorems 7.12 and 7.8.


Remark 7.14 From the proof we observe directly that the operator $I + T$ is an
isomorphism from $L^2(B(0,a))$ onto itself.


As another application of the Lippmann-Schwinger integral equation, we
derive the following asymptotic behavior of u.


Theorem 7.15 Let u be the solution of the scattering problem (7.10), (7.11).
Then

$$u(x) = u^i(x) + \frac{e^{i k|x|}}{|x|}\,u_\infty(\hat x) + O(1/|x|^2) \quad\text{as } |x| \to \infty \tag{7.28}$$

uniformly in $\hat x = x/|x|$, where

$$u_\infty(\hat x) = \frac{k^2}{4\pi} \int_{|y|<a} (n(y) - 1)\,e^{-i k\,\hat x\cdot y}\,u(y)\,dy \quad\text{for } \hat x \in S^2. \tag{7.29}$$

The function $u_\infty : S^2 \to \mathbb{C}$ is called the far field pattern or
scattering amplitude of u. It is analytic on $S^2$ and determines $u^s$ outside of
$B(0,a)$ uniquely; that is, $u_\infty = 0$ on $S^2$ if and only if $u^s(x) = 0$ for
$|x| > a$.


Proof: Formulas (7.28) and (7.29) follow directly from the asymptotic behavior
(7.20) of the fundamental solution $\Phi$. The analyticity of $u_\infty$ follows
from (7.29). Finally, if $u_\infty = 0$, then an application of Rellich's lemma
yields that $u^s = u - u^i = 0$ for all $|x| > a$.


The existence of a far field pattern; that is, a function $u_\infty$ with

$$u^s(x) = \frac{e^{i k|x|}}{|x|}\,u_\infty(\hat x) + O(1/|x|^2) \quad\text{as } |x| \to \infty, \tag{7.30}$$

is not restricted to scattering problems. Indeed, Theorem 7.16 below assures the
existence of the far field pattern for every radiating solution of the Helmholtz
equation.
We now draw some further conclusions from the Lippmann-Schwinger integral
equation $u + Tu = u^i$. First we note that we can also treat the integral
equation in $L^\infty(B(0,a))$ or even in $C(B[0,a])$ because the volume potential
maps $L^\infty$-functions u into continuous functions. In the following we consider T
as an operator from $C(B[0,a])$ into itself. We estimate the norm $\|T\|_\infty$
of the integral operator T of (7.27) with respect to the sup-norm:

$$|(Tu)(x)| \le k^2\,\|1 - n\|_\infty\,\|u\|_\infty \max_{|x|\le a} \int_{|y|<a} |\Phi(x,y)|\,dy \quad\text{for } x \in B[0,a];$$

that is,

$$\|T\|_\infty \le k^2\,\|1 - n\|_\infty \max_{|x|\le a} \int_{|y|<a} \frac{1}{4\pi\,|x-y|}\,dy = \frac{(ka)^2}{2}\,\|1 - n\|_\infty; \tag{7.31}$$

see Problem 7.4. We conclude that $\|T\|_\infty < 1$, provided
$(ka)^2\,\|1 - n\|_\infty < 2$. The contraction mapping Theorem A.31 yields
uniqueness and existence of a solution of the integral equation (7.26) for
$(ka)^2\,\|1 - n\|_\infty < 2$. We know this already even for all values of
$(ka)^2\,\|1 - n\|_\infty$. But Theorem A.31 also tells us that for
$(ka)^2\,\|1 - n\|_\infty < 2$ the solution can be represented as a Neumann series
in the form

$$u = \sum_{j=0}^\infty (-1)^j\,T^j u^i. \tag{7.32}$$

The first two terms of the series are

$$u^b(x) := u^i(x) - k^2 \int_{|y|<a} (1 - n(y))\,u^i(y)\,\Phi(x,y)\,dy, \quad x \in \mathbb{R}^3. \tag{7.33}$$

$u^b$ is called the Born approximation. It provides a good approximation to u in
$B[0,a]$ for small values of $(ka)^2\,\|1 - n\|_\infty$ because

$$\|u^b - u\|_\infty \le \sum_{j=2}^\infty \|T\|_\infty^j\,\|u^i\|_\infty = \|u^i\|_\infty\,\frac{\|T\|_\infty^2}{1 - \|T\|_\infty} \le \frac{(ka)^4}{2}\,\|1 - n\|_\infty^2\,\|u^i\|_\infty$$

for $(ka)^2\,\|1 - n\|_\infty \le 1$.
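
The mechanism behind the Neumann series (7.32) and the error of the Born approximation can be illustrated in finite dimensions. The sketch below is a toy analogue under stated assumptions, not a discretization of (7.26): a small matrix T with maximum row sum q < 1 stands in for the integral operator, a vector f stands in for the incident field, and all names are hypothetical. It sums the series for u + Tu = f, forms the two-term truncation f - Tf, and compares the error with the geometric tail bound q²/(1 - q) times the norm of f.

```python
def mat_vec(T, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]

def sup_norm(v):
    return max(abs(x) for x in v)

def op_norm(T):
    """Operator norm induced by the sup-norm: the maximum absolute row sum."""
    return max(sum(abs(t) for t in row) for row in T)

def neumann_solve(T, f, terms=200):
    """Partial sum of sum_j (-T)^j f, converging to the solution of u + T u = f
    whenever op_norm(T) < 1."""
    term = list(f)
    total = list(f)
    for _ in range(terms):
        term = [-x for x in mat_vec(T, term)]
        total = [a + b for a, b in zip(total, term)]
    return total

def born(T, f):
    """First two terms of the series: f - T f."""
    return [a - b for a, b in zip(f, mat_vec(T, f))]

if __name__ == "__main__":
    T = [[0.1, 0.2j, 0.0], [0.05, -0.1, 0.15], [0.0, 0.3, 0.1]]
    f = [1.0, 1j, -0.5]
    u = neumann_solve(T, f)
    q = op_norm(T)
    err = sup_norm([a - b for a, b in zip(u, born(T, f))])
    print("error:", err, "bound:", sup_norm(f) * q**2 / (1.0 - q))
```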


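The far field pattern depends on both the direction $\hat x \in S^2$ of observation
and the direction $\hat\theta \in S^2$ of the incident field $u^i$. Therefore, we
often write $u_\infty(\hat x; \hat\theta)$ to indicate this dependence. For the Born
approximation, we see from the asymptotic form (7.20) of $\Phi(x,y)$ that

$$u^b_\infty(\hat x; \hat\theta) = \frac{k^2}{4\pi} \int_{\mathbb{R}^3} (n(y) - 1)\,e^{i k\,\hat\theta\cdot y}\,e^{-i k\,\hat x\cdot y}\,dy = \frac{k^2}{4\pi} \int_{\mathbb{R}^3} (n(y) - 1)\,e^{i k(\hat\theta - \hat x)\cdot y}\,dy, \tag{7.34}$$

and this is just the Fourier transform of $m := n - 1$:

$$u^b_\infty(\hat x; \hat\theta) = \frac{k^2}{4\pi}\,m^\sim(k\hat x - k\hat\theta), \quad \hat x, \hat\theta \in S^2, \tag{7.35}$$

where the Fourier transform is defined by

$$f^\sim(x) := \int_{\mathbb{R}^3} f(y)\,e^{-i x\cdot y}\,dy, \quad x \in \mathbb{R}^3.$$

From this, the reciprocity principle follows:

$$u^b_\infty(-\hat\theta; -\hat x) = u^b_\infty(\hat x; \hat\theta) \quad\text{for } \hat x, \hat\theta \in S^2. \tag{7.36}$$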
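
For a concrete feeling for (7.35) and (7.36), consider the simplest model one can compute by hand: a constant contrast n - 1 = q in the ball |y| < a and zero outside. This model, the closed-form Fourier transform of the ball, and the function names below are assumptions made for illustration; they do not come from the text. The sketch evaluates the Born far field pattern and lets one confirm that swapping (x̂, θ̂) for (-θ̂, -x̂) leaves the value unchanged, because only the difference x̂ - θ̂ enters.

```python
import math

def ball_fourier_transform(xi_norm, a):
    """Integral of exp(-i xi.y) over the ball |y| < a; real and radial in xi."""
    s = xi_norm * a
    if s < 1e-3:
        # Series expansion near xi = 0 avoids cancellation in sin(s) - s*cos(s).
        return 4.0 * math.pi * a**3 / 3.0 * (1.0 - s * s / 10.0)
    return 4.0 * math.pi * (math.sin(s) - s * math.cos(s)) / xi_norm**3

def born_far_field(x_hat, theta_hat, k, a, contrast):
    """Born far field (7.35) for the assumed model n - 1 = contrast on |y| < a."""
    xi = [k * (xc - tc) for xc, tc in zip(x_hat, theta_hat)]
    return k**2 / (4.0 * math.pi) * contrast * ball_fourier_transform(math.hypot(*xi), a)

def negate(v):
    return tuple(-c for c in v)

if __name__ == "__main__":
    x_hat, theta_hat = (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)
    print(born_far_field(x_hat, theta_hat, 3.0, 0.5, 0.2))
    print(born_far_field(negate(theta_hat), negate(x_hat), 3.0, 0.5, 0.2))
```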


We show that this relation holds for $u_\infty$ itself. Before we can prove this
principle for $u_\infty$, we need the important Green's representation theorem,
which expresses radiating solutions of the Helmholtz equation in terms of the
Dirichlet and Neumann boundary data.


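Theorem 7.16 (Green's representation theorem)

Let $\Omega \subset \mathbb{R}^3$ be a bounded domain and
$\Omega^c := \mathbb{R}^3 \setminus \overline{\Omega}$ its exterior. Let the
boundary $\partial\Omega$ be sufficiently smooth so that Gauss' theorem holds. Let
the unit normal vector $\nu(x)$ in $x \in \partial\Omega$ be directed into the
exterior of $\Omega$.


(a) Let $u \in C^2(\Omega) \cap C^1(\overline{\Omega})$. Then

$$u(x) = \int_{\partial\Omega} \Bigl[\Phi(x,y)\,\frac{\partial u}{\partial\nu}(y) - u(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y) - \int_\Omega \Phi(x,y)\,\bigl[\Delta u(y) + k^2 u(y)\bigr]\,dy, \quad x \in \Omega. \tag{7.37a}$$


(b) Let $u^s \in C^2(\Omega^c) \cap C^1(\overline{\Omega^c})$ be a solution of the
Helmholtz equation $\Delta u^s + k^2 u^s = 0$ in $\Omega^c$, and let $u^s$ satisfy
the radiation condition (7.11). Then

$$\int_{\partial\Omega} \Bigl[\Phi(x,y)\,\frac{\partial u^s}{\partial\nu}(y) - u^s(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y) = \begin{cases} 0, & x \in \Omega, \\ -u^s(x), & x \in \Omega^c. \end{cases} \tag{7.37b}$$

The far field pattern of $u^s$ has the representation

$$u_\infty(\hat x) = \frac{1}{4\pi} \int_{\partial\Omega} \Bigl[u^s(y)\,\frac{\partial}{\partial\nu(y)} e^{-i k\,\hat x\cdot y} - e^{-i k\,\hat x\cdot y}\,\frac{\partial u^s}{\partial\nu}(y)\Bigr]\,ds(y) \tag{7.38}$$

for $\hat x \in S^2$.


For a proof, we refer to [161], Theorems 3.3 and 3.6, or [55], Theorems 2.1
and 2.5. As a corollary, we prove the following useful lemma.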
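Lemma 7.17 Let $\Omega \subset \mathbb{R}^3$ be a domain that is decomposed into two
disjoint subdomains:
$\overline{\Omega} = \overline{\Omega}_1 \cup \overline{\Omega}_2$ such that
$\Omega_1 \cap \Omega_2 = \emptyset$. Let the boundaries $\partial\Omega_1$ and
$\partial\Omega_2$ be smooth (i.e., $C^2$). Let
$u_j \in C^2(\Omega_j) \cap C^1(\overline{\Omega}_j)$ for $j = 1, 2$ be solutions
of the Helmholtz equation $\Delta u_j + k^2 u_j = 0$ in $\Omega_j$. Furthermore, let
$u_1 = u_2$ on $\Gamma$ and $\partial u_1/\partial\nu = \partial u_2/\partial\nu$ on
$\Gamma$, where $\Gamma$ denotes the common boundary
$\Gamma := \partial\Omega_1 \cap \partial\Omega_2$. Then the function u, defined by

$$u(x) = \begin{cases} u_1(x), & x \in \Omega_1, \\ u_2(x), & x \in \Omega_2, \end{cases}$$

can be extended to an analytic function in $\Omega$ that satisfies the Helmholtz
equation $\Delta u + k^2 u = 0$ in $\Omega$.


Proof: It follows from Green's representation theorem that $u_1$ and $u_2$ are
analytic in $\Omega_1$ and $\Omega_2$, respectively. We fix
$x_0 \in \Gamma \cap \Omega$ and choose a small ball $B(x_0,\varepsilon)$ that is
entirely contained in $\Omega$. Let $B_j := B(x_0,\varepsilon) \cap \Omega_j$,
$j = 1, 2$, and $x \in B_1$. We apply Green's representation theorems to $u_1$ in
$B_1$ and to $u_2$ in $B_2$ and arrive at

$$u_1(x) = \int_{\partial B_1} \Bigl[\Phi(x,y)\,\frac{\partial u_1}{\partial\nu}(y) - u_1(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y), \quad x \in B_1,$$

$$0 = \int_{\partial B_2} \Bigl[\Phi(x,y)\,\frac{\partial u_2}{\partial\nu}(y) - u_2(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y), \quad x \in B_1.$$

We add both equations and note that the contributions on
$\Gamma \cap B(x_0,\varepsilon)$ cancel. This yields

$$u_1(x) = \int_{\partial B(x_0,\varepsilon)} \Bigl[\Phi(x,y)\,\frac{\partial u}{\partial\nu}(y) - u(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y), \quad x \in B_1.$$

Interchanging the roles of $j = 1$ and $j = 2$ yields

$$u_2(x) = \int_{\partial B(x_0,\varepsilon)} \Bigl[\Phi(x,y)\,\frac{\partial u}{\partial\nu}(y) - u(y)\,\frac{\partial}{\partial\nu(y)}\Phi(x,y)\Bigr]\,ds(y), \quad x \in B_2.$$

The right-hand side defines an analytic function in $B(x_0,\varepsilon)$.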


7.3 Properties of the Far Field Patterns 


First, we prove a reciprocity principle for $u_\infty$. It states the (physically
obvious) fact that it is the same if we illuminate an object from the direction
$\hat\theta$ and observe it in the direction $-\hat x$ or the other way around:
illumination from $\hat x$ and observation in $-\hat\theta$.
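Theorem 7.18 (Reciprocity principle)
Let $u_\infty(\hat x; \hat\theta)$ be the far field pattern corresponding to the
direction $\hat x$ of observation and the direction $\hat\theta$ of the incident
plane wave. Then

$$u_\infty(\hat x; \hat\theta) = u_\infty(-\hat\theta; -\hat x) \quad\text{for all } \hat x, \hat\theta \in S^2. \tag{7.39}$$


Proof: First we observe that the solutions u and thus also $u^s$ of the
scattering problems are analytic outside of the ball $B(0,a)$. Therefore, the
Green's theorems and also Theorem 7.16 are applicable. Application of Green's
second formula to $u^i$ and $u^s$ in the interior and exterior of
$\{x \in \mathbb{R}^3 : |x| = a\}$, respectively, yields

$$0 = \int_{|y|=a} \Bigl[u^i(y;\hat\theta)\,\frac{\partial}{\partial\nu} u^i(y;-\hat x) - u^i(y;-\hat x)\,\frac{\partial}{\partial\nu} u^i(y;\hat\theta)\Bigr]\,ds(y),$$

$$0 = \int_{|y|=a} \Bigl[u^s(y;\hat\theta)\,\frac{\partial}{\partial\nu} u^s(y;-\hat x) - u^s(y;-\hat x)\,\frac{\partial}{\partial\nu} u^s(y;\hat\theta)\Bigr]\,ds(y).$$

(More precisely, to prove the second equation, one applies Green's second
formula to $u^s$ in the region $\{x \in \mathbb{R}^3 : a < |x| < R\}$ with $R > a$
and lets R tend to infinity.)

Now we use the representations (7.38) for the far field patterns
$u_\infty(\hat x; \hat\theta)$ and $u_\infty(-\hat\theta; -\hat x)$:

$$4\pi\,u_\infty(\hat x; \hat\theta) = \int_{|y|=a} \Bigl[u^s(y;\hat\theta)\,\frac{\partial}{\partial\nu} u^i(y;-\hat x) - u^i(y;-\hat x)\,\frac{\partial}{\partial\nu} u^s(y;\hat\theta)\Bigr]\,ds(y),$$

$$4\pi\,u_\infty(-\hat\theta; -\hat x) = \int_{|y|=a} \Bigl[u^s(y;-\hat x)\,\frac{\partial}{\partial\nu} u^i(y;\hat\theta) - u^i(y;\hat\theta)\,\frac{\partial}{\partial\nu} u^s(y;-\hat x)\Bigr]\,ds(y).$$

We subtract the last of these equations from the sum of the first three. This
yields

$$4\pi\,\bigl[u_\infty(\hat x; \hat\theta) - u_\infty(-\hat\theta; -\hat x)\bigr] = \int_{|y|=a} \Bigl[u(y;\hat\theta)\,\frac{\partial}{\partial\nu} u(y;-\hat x) - u(y;-\hat x)\,\frac{\partial}{\partial\nu} u(y;\hat\theta)\Bigr]\,ds(y).$$

We have to show that this expression vanishes. But this follows directly from
subtracting Green's formula (7.13) applied to
$(v,w) = (u(\cdot;\hat\theta), u(\cdot;-\hat x))$ from the one applied to
$(v,w) = (u(\cdot;-\hat x), u(\cdot;\hat\theta))$.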


The far field patterns $u_\infty(\hat x; \hat\theta)$, $\hat x, \hat\theta \in S^2$,
define the integral operator

$$(Fg)(\hat x) = \int_{S^2} u_\infty(\hat x; \hat\theta)\,g(\hat\theta)\,ds(\hat\theta) \quad\text{for } \hat x \in S^2, \tag{7.40}$$

which we call the far field operator. It is certainly compact in $L^2(S^2)$ and is
related to the scattering operator $S : L^2(S^2) \to L^2(S^2)$ by

$$S = I + \frac{i k}{2\pi}\,F.$$

The next results prove some properties of these operators. Some of them are
important in Sections 7.5 and 7.7. We begin with a technical lemma (see [54]).


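Lemma 7.19 For $g, h \in L^2(S^2)$, define the Herglotz wave functions $v^i$ and
$w^i$ by

$$v^i(x) = \int_{S^2} e^{i k\,x\cdot\hat\theta}\,g(\hat\theta)\,ds(\hat\theta), \quad x \in \mathbb{R}^3, \tag{7.41a}$$

$$w^i(x) = \int_{S^2} e^{i k\,x\cdot\hat\theta}\,h(\hat\theta)\,ds(\hat\theta), \quad x \in \mathbb{R}^3, \tag{7.41b}$$

respectively. Let v and w be the solutions of the scattering problem (7.10), (7.11)
corresponding to incident fields $v^i$ and $w^i$, respectively. Then

$$i k^2 \int_{B(0,a)} (\mathrm{Im}\,n)\,v\,\overline{w}\,dx = 2\pi\,(Fg, h)_{L^2(S^2)} - 2\pi\,(g, Fh)_{L^2(S^2)} - i k\,(Fg, Fh)_{L^2(S^2)}. \tag{7.42}$$


Proof: Let $v^s = v - v^i$ and $w^s = w - w^i$ denote the scattered fields with
far field patterns $v_\infty$ and $w_\infty$, respectively. Then, by linearity,
$v_\infty = Fg$ and $w_\infty = Fh$. Green's formula in the form (7.13) yields

$$\int_{|x|=a} v\,\frac{\partial\overline{w}}{\partial\nu}\,ds = \int_{|x|<a} \bigl[\nabla v\cdot\nabla\overline{w} - k^2\,\overline{n}\,v\,\overline{w}\bigr]\,dx.$$

Now we interchange the roles of v and $\overline{w}$, which yields

$$\int_{|x|=a} \overline{w}\,\frac{\partial v}{\partial\nu}\,ds = \int_{|x|<a} \bigl[\nabla v\cdot\nabla\overline{w} - k^2\,n\,v\,\overline{w}\bigr]\,dx.$$

Subtracting the results yields

$$2 i k^2 \int_{B(0,a)} (\mathrm{Im}\,n)\,v\,\overline{w}\,dx = \int_{|x|=a} \Bigl[v\,\frac{\partial\overline{w}}{\partial\nu} - \overline{w}\,\frac{\partial v}{\partial\nu}\Bigr]\,ds.$$

The integral on the right-hand side is split into four parts by decomposing
$v = v^i + v^s$ and $w = w^i + w^s$. The integral

$$\int_{|x|=a} \Bigl[v^i\,\frac{\partial\overline{w^i}}{\partial\nu} - \overline{w^i}\,\frac{\partial v^i}{\partial\nu}\Bigr]\,ds$$

vanishes by Green's second formula because $v^i$ and $\overline{w^i}$ are solutions
of the Helmholtz equation $\Delta u + k^2 u = 0$. We write

$$\int_{|x|=a} \Bigl[v^s\,\frac{\partial\overline{w^s}}{\partial\nu} - \overline{w^s}\,\frac{\partial v^s}{\partial\nu}\Bigr]\,ds = \int_{|x|=R} \Bigl[v^s\,\frac{\partial\overline{w^s}}{\partial r} - \overline{w^s}\,\frac{\partial v^s}{\partial r}\Bigr]\,ds$$

and note that by the radiation condition (7.11) and the form (7.30)

$$v^s(x)\,\frac{\partial\overline{w^s}(x)}{\partial r} - \overline{w^s}(x)\,\frac{\partial v^s(x)}{\partial r} = -\frac{2 i k}{r^2}\,v_\infty(\hat x)\,\overline{w_\infty(\hat x)} + O(1/r^3).$$

From this

$$\int_{|x|=R} \Bigl[v^s\,\frac{\partial\overline{w^s}}{\partial r} - \overline{w^s}\,\frac{\partial v^s}{\partial r}\Bigr]\,ds \longrightarrow -2 i k \int_{S^2} v_\infty\,\overline{w_\infty}\,ds = -2 i k\,(Fg, Fh)_{L^2(S^2)}$$

follows as R tends to infinity. Finally, we use the definition of $v^i$ and $w^i$
and the representation (7.38) to compute

$$\int_{|x|=a} \Bigl[v^i\,\frac{\partial\overline{w^s}}{\partial\nu} - \overline{w^s}\,\frac{\partial v^i}{\partial\nu}\Bigr]\,ds
= \int_{S^2} g(\hat\theta) \int_{|x|=a} \Bigl[e^{i k\,x\cdot\hat\theta}\,\frac{\partial\overline{w^s}(x)}{\partial\nu} - \overline{w^s}(x)\,\frac{\partial}{\partial\nu} e^{i k\,x\cdot\hat\theta}\Bigr]\,ds(x)\,ds(\hat\theta)$$

$$= -4\pi \int_{S^2} g(\hat\theta)\,\overline{w_\infty(\hat\theta)}\,ds(\hat\theta) = -4\pi\,(g, Fh)_{L^2(S^2)}.$$

Analogously, we have that

$$\int_{|x|=a} \Bigl[v^s\,\frac{\partial\overline{w^i}}{\partial\nu} - \overline{w^i}\,\frac{\partial v^s}{\partial\nu}\Bigr]\,ds = 4\pi\,(Fg, h)_{L^2(S^2)}.$$

This ends the proof.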


We can now give a simple proof of the unitarity of the scattering operator 
for real-valued n. 


Theorem 7.20 Let $n \in L^\infty(\mathbb{R}^3)$ be real-valued such that the support
of $n - 1$ is contained in $B(0,a)$. Then F is normal (i.e., $F^*F = FF^*$), and the
scattering operator $S := I + \frac{i k}{2\pi}F$ is unitary (i.e.,
$S^*S = SS^* = I$).


Proof: The preceding lemma implies that

$$i k\,(Fg, Fh)_{L^2(S^2)} = 2\pi\,(Fg, h)_{L^2(S^2)} - 2\pi\,(g, Fh)_{L^2(S^2)} \tag{7.43}$$

for all $g, h \in L^2(S^2)$. By reciprocity (Theorem 7.18), we conclude that

$$(F^*g)(\hat\theta) = \int_{S^2} \overline{u_\infty(\hat x; \hat\theta)}\,g(\hat x)\,ds(\hat x) = \int_{S^2} \overline{u_\infty(-\hat\theta; -\hat x)}\,g(\hat x)\,ds(\hat x)$$

and thus $F^*g = \overline{RFR\overline{g}}$, where $(Rh)(\hat x) := h(-\hat x)$ for
$\hat x \in S^2$. Noting that
$(Rg, Rh)_{L^2(S^2)} = (g, h)_{L^2(S^2)} = (\overline{h}, \overline{g})_{L^2(S^2)}$ for
all $g, h \in L^2(S^2)$ and using (7.43) twice, we conclude that

$$\begin{aligned}
i k\,(F^*h, F^*g)_{L^2(S^2)} &= i k\,(RFR\overline{g}, RFR\overline{h})_{L^2(S^2)} = i k\,(FR\overline{g}, FR\overline{h})_{L^2(S^2)} \\
&= 2\pi\,(FR\overline{g}, R\overline{h})_{L^2(S^2)} - 2\pi\,(R\overline{g}, FR\overline{h})_{L^2(S^2)} \\
&= 2\pi\,(RFR\overline{g}, \overline{h})_{L^2(S^2)} - 2\pi\,(\overline{g}, RFR\overline{h})_{L^2(S^2)} \\
&= 2\pi\,(h, F^*g)_{L^2(S^2)} - 2\pi\,(F^*h, g)_{L^2(S^2)} \\
&= 2\pi\,(Fh, g)_{L^2(S^2)} - 2\pi\,(h, Fg)_{L^2(S^2)} \\
&= i k\,(Fh, Fg)_{L^2(S^2)}.
\end{aligned}$$

This holds for all $g, h \in L^2(S^2)$; thus $F^*F = FF^*$.
Finally, from (7.43), we conclude that

$$-(g, i k\,F^*F h)_{L^2(S^2)} = 2\pi\,(g, (F^* - F)h)_{L^2(S^2)} \quad\text{for all } g, h \in L^2(S^2);$$

that is, $i k\,F^*F = 2\pi\,(F - F^*)$. This formula, together with the normality of
F, yields $S^*S = SS^* = I$ by substituting the definition of S into $S^*S$ and
$SS^*$.


It is well known that the eigenvalues of unitary operators all lie on the unit
circle in $\mathbb{C}$. From the definition $S = I + \frac{i k}{2\pi}F$, we conclude
that the eigenvalues of F lie on the circle $|2\pi i/k - z| = 2\pi/k$ with center
$2\pi i/k$ and radius $2\pi/k$. We later show (Lemma 7.36) that the eigenvalues
tend to zero from the right half of this circle. These properties hold for
real-valued indices of refraction n. For further results for absorbing media
(i.e., for which n is complex-valued), we refer to the original literature [54].
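
The passage from the unit circle to the circle around 2πi/k is a one-line computation: if s is an eigenvalue of S with |s| = 1, the corresponding eigenvalue of F is λ = 2π(s - 1)/(ik), and λ - 2πi/k = -2πi s/k has modulus 2π/k. The short Python sketch below merely replays this arithmetic for sample points s on the unit circle; the function names and the sample values of k are illustrative assumptions.

```python
import cmath
import math

def far_field_eigenvalue(s, k):
    """Eigenvalue of F that corresponds to the eigenvalue s of S = I + (ik/2pi) F."""
    return 2.0 * math.pi * (s - 1.0) / (1j * k)

def distance_to_center(s, k):
    """Distance of that eigenvalue of F from the center 2*pi*i/k of the circle."""
    center = 2j * math.pi / k
    return abs(far_field_eigenvalue(s, k) - center)

if __name__ == "__main__":
    k = 3.0
    for m in range(8):
        s = cmath.exp(2j * math.pi * m / 8)  # sample point on the unit circle
        print(m, far_field_eigenvalue(s, k), distance_to_center(s, k), 2.0 * math.pi / k)
```

In particular, every such λ has nonnegative imaginary part, since Im λ = 2π(1 - Re s)/k.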


A number of numerical methods for determining the shape D of the support
of the contrast $n - 1$, for example, the dual space method by Colton and Monk
(or "superposition of incident fields", see [59, 55]) or the linear sampling method
(see [51, 160]), study the question of unique solvability of the far field equation
$Fg = f$; that is,

$$\int_{S^2} u_\infty(\hat x; \hat\theta)\,g(\hat\theta)\,ds(\hat\theta) = f(\hat x), \quad \hat x \in S^2,$$

for different right-hand sides f. The question of injectivity of the far field
operator F is particularly important. We show that the null space of F is
characterized by the following unusual eigenvalue problem, the interior
transmission eigenvalue problem, which will be the subject of investigation in
Section 7.6. Let D be some bounded Lipschitz domain that contains the support of
$m = n - 1$.


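Interior Transmission Eigenvalue Problem: Determine $k > 0$ and
$v, w \in L^2(D)$, $(v,w) \ne (0,0)$, such that

$$\Delta v + k^2 v = 0 \ \text{in } D, \qquad \Delta w + k^2 n w = 0 \ \text{in } D, \tag{7.44a}$$

$$v = w \ \text{on } \partial D, \qquad \frac{\partial v}{\partial\nu} = \frac{\partial w}{\partial\nu} \ \text{on } \partial D. \tag{7.44b}$$

We also consider an inhomogeneous version of this system:


Interior Transmission Problem: Given $f, g \in L^2(\partial D)$, determine
$v, w \in L^2(D)$ such that

$$\Delta v + k^2 v = 0 \ \text{in } D, \qquad \Delta w + k^2 n w = 0 \ \text{in } D, \tag{7.45a}$$

$$w - v = f \ \text{on } \partial D, \qquad \frac{\partial w}{\partial\nu} - \frac{\partial v}{\partial\nu} = g \ \text{on } \partial D. \tag{7.45b}$$

The solutions of (7.44a), (7.44b) and (7.45a), (7.45b), respectively, have to be
understood in the ultra-weak sense. To motivate the formulation we multiply
the equations for v and w with $\phi \in C^2(\overline{D})$ and
$\psi \in C^2(\overline{D})$, respectively, where $\phi = \psi$ and
$\partial\phi/\partial\nu = \partial\psi/\partial\nu$ on $\partial D$, and apply
Green's second formula in D formally:

$$\int_D (\Delta\psi + k^2 n\,\psi)\,w\,dx = \int_{\partial D} \Bigl[w\,\frac{\partial\psi}{\partial\nu} - \psi\,\frac{\partial w}{\partial\nu}\Bigr]\,ds,$$

$$\int_D (\Delta\phi + k^2\phi)\,v\,dx = \int_{\partial D} \Bigl[v\,\frac{\partial\phi}{\partial\nu} - \phi\,\frac{\partial v}{\partial\nu}\Bigr]\,ds.$$

Subtraction and insertion of the boundary conditions yields the following form:

$$\int_D (\Delta\psi + k^2 n\,\psi)\,w\,dx - \int_D (\Delta\phi + k^2\phi)\,v\,dx = \int_{\partial D} \Bigl[f\,\frac{\partial\psi}{\partial\nu} - g\,\psi\Bigr]\,ds$$

for all $\phi, \psi \in C^2(\overline{D})$ with $\phi = \psi$ and
$\partial\phi/\partial\nu = \partial\psi/\partial\nu$ on $\partial D$. We take this
form as the definition of an ultra-weak solution of (7.45a), (7.45b) (and,
analogously, of (7.44a), (7.44b) for $f = g = 0$).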


Definition 7.21 Let D be a bounded Lipschitz domain.


(a) The wave number k is an interior transmission eigenvalue if there exists
a nontrivial pair $(v,w) \in L^2(D) \times L^2(D)$ of fields that satisfies (7.44a)
and (7.44b) in the ultra-weak sense; that is,

$$\int_D (\Delta\psi + k^2 n\,\psi)\,w\,dx - \int_D (\Delta\phi + k^2\phi)\,v\,dx = 0 \tag{7.46a}$$

for all $\phi, \psi \in H^2(D)$ with $\phi - \psi \in H^2_0(D)$.

(b) Let $f, g \in L^2(\partial D)$. The pair $(v,w) \in L^2(D) \times L^2(D)$ is
called an ultra-weak solution of (7.45a), (7.45b) if

$$\int_D (\Delta\psi + k^2 n\,\psi)\,w\,dx - \int_D (\Delta\phi + k^2\phi)\,v\,dx = \int_{\partial D} \Bigl(f\,\frac{\partial\psi}{\partial\nu} - g\,\psi\Bigr)\,ds \tag{7.46b}$$

for all $\phi, \psi \in C^2(\overline{D})$ such that $\phi - \psi$ has compact
support in D.


Note that in part (b) we replaced the $H^2$-test functions with smooth test
functions to avoid the notion of traces of $H^2$-functions.³ However, if one likes
to use the trace theorem then one can take the same test functions as in part (a)
(density argument).

We show the following theorem (see [61, 62, 154]). 


Theorem 7.22 Let $D \subset \mathbb{R}^3$ be a bounded Lipschitz domain such that
the exterior of D is connected and $n = 1$ outside of D.


(a) $g \in L^2(S^2)$ is a solution of the homogeneous integral equation

$$\int_{S^2} u_\infty(\hat x; \hat\theta)\,g(\hat\theta)\,ds(\hat\theta) = 0, \quad \hat x \in S^2, \tag{7.47}$$

if and only if there exist $v, w \in L^2(D)$ such that $(v,w)$ solve (7.44a),
(7.44b) in the ultra-weak sense of (7.46a), and v is the Herglotz wave
function defined by

$$v(x) = \int_{S^2} e^{i k\,x\cdot\hat\theta}\,g(\hat\theta)\,ds(\hat\theta), \quad x \in \mathbb{R}^3. \tag{7.48}$$

In particular, F is one-to-one if the system (7.44a), (7.44b) is only solvable
by the trivial solution $v = w = 0$ in D; that is, if k is not an interior
transmission eigenvalue.


(b) Let $z \in D$ be fixed. The integral equation

$$\int_{S^2} u_\infty(\hat x; \hat\theta)\,g(\hat\theta)\,ds(\hat\theta) = e^{-i k\,\hat x\cdot z}, \quad \hat x \in S^2, \tag{7.49}$$

of the first kind is solvable in $L^2(S^2)$ if and only if the interior
transmission problem

$$\Delta v + k^2 v = 0 \ \text{in } D, \qquad \Delta w + k^2 n w = 0 \ \text{in } D, \tag{7.50a}$$

$$w(x) - v(x) = \frac{\exp(i k|x - z|)}{|x - z|} \quad\text{on } \partial D, \tag{7.50b}$$

$$\frac{\partial w(x)}{\partial\nu} - \frac{\partial v(x)}{\partial\nu} = \frac{\partial}{\partial\nu}\,\frac{\exp(i k|x - z|)}{|x - z|} \quad\text{on } \partial D, \tag{7.50c}$$

has an ultra-weak solution $w, v \in L^2(D)$ in the sense of (7.46b), and v is
of the form (7.48).


(c) For $z \notin D$ the integral equation (7.49) is never solvable in $L^2(S^2)$.


³ We mention that the unit normal vector $\nu(x)$ exists for almost all
$x \in \partial D$ and defines an $L^\infty$-vector field. Therefore, the integral is
well defined.


Proof: (a) Let g € L?(S?) be a solution of (7.47) and define v by (7.48). We 
observe that the left-hand side of (7.47) is a superposition of far field patterns. 
Therefore, the far field pattern w. of the scattered field w*® that corresponds to 
the incident field w* = v vanishes. The corresponding total field w = w’ + v € 
H7.,.(R®) satisfies the Helmholtz equation Aw + k?nw = 0 in R3. By Rellich’s 
lemma (Lemma 7.5), the scattered field w* = w — v vanishes outside of D 
and thus w —v € H}§(D) by Lemma 7.4. Let now ¢,~ € H?(D) such that 
o—w € H3(D). Then we have with Green’s formula (7.12b) and the differential 
equations for w and v 


[lw =) (Au + knw) ae = f via ve ee ene 


D 


= # fa—neyae 


D 
that is, 
[wa + knw) dx = [eau+ ee) dx 


D D 


Furthermore, again by (7.12b), 
[ra@-¥) +P 6-0) ae = @, (7.51) 
D 
Combining these equations yields 
ic (Ad + k?np) dx = fe (Ad + k?¢) dx 
D D 


which proves the first direction. 


Now let $v$ be of the form (7.48) and let there exist $w \in L^2(D)$ such that $(v, w)$ solves the eigenvalue problem (7.44a), (7.44b) in the sense of (7.46a). We extend $w$ to all of $\mathbb{R}^3$ by setting $w := v$ on $\mathbb{R}^3 \setminus D$. Let $\psi \in C^2(\mathbb{R}^3)$ with compact support. Then

\[ \int_{\mathbb{R}^3} w\,(\Delta\psi + k^2 n\psi)\, dx = \int_D w\,(\Delta\psi + k^2 n\psi)\, dx + \int_{\mathbb{R}^3\setminus D} v\,(\Delta\psi + k^2\psi)\, dx \]
\[ = \int_D v\,(\Delta\psi + k^2\psi)\, dx + \int_{\mathbb{R}^3\setminus D} v\,(\Delta\psi + k^2\psi)\, dx = \int_{\mathbb{R}^3} v\,(\Delta\psi + k^2\psi)\, dx = 0 \]

by Green's second formula. Here we have used (7.46a) for $\varphi = \psi$ and the fact that $v$ is a smooth solution of the Helmholtz equation. Therefore, $w \in L^2_{loc}(\mathbb{R}^3)$ is an ultra-weak solution of $\Delta w + k^2 n w = 0$ in $\mathbb{R}^3$, compare with Lemma 7.10. The regularity result of that lemma yields $w \in H^2_{loc}(\mathbb{R}^3)$.

The difference $w - v$ vanishes in the exterior of $D$ and obviously satisfies the radiation condition. Therefore, $w \in H^2_{loc}(\mathbb{R}^3)$ is the unique total field corresponding to the incident field $v$. The far field pattern $w_\infty$ of the corresponding scattered field $w^s = w - v$ vanishes. As in the previous part, we see that $w$ is a superposition

\[ w(x) = \int_{S^2} u(x;\hat{\theta})\, g(\hat{\theta})\, ds(\hat{\theta}) \]

of total fields. For the corresponding far field patterns we conclude that

\[ 0 = w_\infty(\hat{x}) = \int_{S^2} u_\infty(\hat{x};\hat{\theta})\, g(\hat{\theta})\, ds(\hat{\theta}) \]

for all $\hat{x} \in S^2$. This proves part (a).


(b) The proof is very similar to the preceding one. Let $g \in L^2(S^2)$ be a solution of (7.49) and define $v$ as in (7.48). As in part (a), the integral is the far field pattern $w_\infty$ corresponding to the total field $w$ that satisfies the Helmholtz equation. Now $w_\infty$ does not vanish but is equal to the function $\exp(-ik\, z\cdot\hat{x})$. By Theorem 7.9, the only radiating solution of the Helmholtz equation with this far field pattern is the spherical wave $\exp(ik|x - z|)/|x - z|$. Because $z$ is contained in $D$ and the exterior of $D$ is connected, the scattered waves $w(x) - v(x)$ and $\exp(ik|x - z|)/|x - z|$ have to coincide outside of $D$. Now we modify the function $x \mapsto \exp(ik|x - z|)/|x - z|$ in a small ball $B[z,\varepsilon] \subset D$ centered at $z$ such that the modified function, which we call $F$, is smooth in $\mathbb{R}^3$ and coincides with $\exp(ik|\cdot - z|)/|\cdot - z|$ in $\mathbb{R}^3 \setminus D$. Now we argue as in part (a). For $\varphi, \psi \in C^2(\overline{D})$ such that $\varphi - \psi$ has compact support in $D$ we apply (7.12b) where we use that $w - v - F \in H^2_0(D)$. This yields

\[ \int_D (w - v - F)\,(\Delta\psi + k^2 n\psi)\, dx = k^2 \int_D (1-n)\,\psi\, v\, dx - \int_D \psi\,(\Delta F + k^2 n F)\, dx, \]

thus, using also (7.51) which is unchanged,

\[ \int_D w\,(\Delta\psi + k^2 n\psi)\, dx = \int_D v\,(\Delta\varphi + k^2\varphi)\, dx + \int_D (F\,\Delta\psi - \psi\,\Delta F)\, dx. \]

Application of Green's second theorem (classical because $\partial D$, $F$, and $\psi$ are sufficiently smooth) proves the first direction.

The second direction is proved in the same way as in part (a). If $v$ is of the form (7.48) and $w \in L^2(D)$ such that $(v, w)$ solves (7.46b) then we set $w(x) := v(x) + \exp(ik|x - z|)/|x - z|$ in the exterior, choose $F$ as above, and show that $w$ is an ultra-weak solution of $\Delta w + k^2 n w = 0$ in $\mathbb{R}^3$. Therefore, $w$ is the total field corresponding to the incident field $v$ with scattered field $\exp(ik|x - z|)/|x - z|$. Equation (7.49) follows since $\exp(-ik\, z\cdot\hat{x})$ is the far field pattern.


(c) Assume, on the contrary, that (7.49) is solvable for some $g \in L^2(S^2)$ and define $v$ as in (7.48). Then, as in part (b), the spherical wave $\exp(ik|x - z|)/|x - z|$ coincides with $v$ in the exterior of $D \cup \{z\}$. This leads to a contradiction as in the proof of Theorem 6.14 because $v$ is bounded in $z$ and the spherical wave is singular for $x = z$. Here we note that any Lipschitz domain satisfies the exterior cone condition (see Problem 6.6 for a two-dimensional example).


As an application of Theorem 7.22, we give conditions under which the range of the far field operator $F$ from (7.40) is dense in $L^2(S^2)$. From $(Fg, h)_{L^2(S^2)} = (g, F^* h)_{L^2(S^2)}$ for all $g, h \in L^2(S^2)$, it is seen that the orthogonal complement of the range of $F$ is characterized by the null space of the adjoint $F^*$ of $F$.


Theorem 7.23 The null space $\{h \in L^2(S^2) : F^* h = 0\}$ consists exactly of those functions $h \in L^2(S^2)$ for which the corresponding Herglotz wave functions

\[ v(x) = \int_{S^2} e^{ik\, x\cdot\hat{\theta}}\, \overline{h(-\hat{\theta})}\, ds(\hat{\theta}), \quad x \in \mathbb{R}^3, \]

satisfy the interior transmission eigenvalue problem (7.44a), (7.44b) for some $w \in L^2(D)$.


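Proof: By using the reciprocity principle (Theorem 7.18), we conclude that

\[ F^* h = 0 \iff \int_{S^2} \overline{u_\infty(\hat{\theta};\hat{x})}\, h(\hat{\theta})\, ds(\hat{\theta}) = 0 \text{ for all } \hat{x} \in S^2 \]
\[ \iff \int_{S^2} u_\infty(-\hat{x};-\hat{\theta})\, \overline{h(\hat{\theta})}\, ds(\hat{\theta}) = 0 \text{ for all } \hat{x} \in S^2 \]
\[ \iff \int_{S^2} u_\infty(\hat{x};\hat{\theta})\, \overline{h(-\hat{\theta})}\, ds(\hat{\theta}) = 0 \text{ for all } \hat{x} \in S^2. \]

Application of Theorem 7.22 yields the assertion.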


By the previous theorem, $F$ is one-to-one and the range of $F$ is dense in $L^2(S^2)$ if $k$ is not an interior transmission eigenvalue. We will investigate the interior transmission eigenvalue problem (7.44a), (7.44b) in more detail in Section 7.6 below. In particular, we will show that the set of eigenvalues is discrete and accumulates at infinity.


The results (b) and (c) of Theorem 7.22 indicate that it should be possible to characterize the unknown set $D$ by a criterion that depends on the solvability of the integral equation (7.49) of the first kind. A mathematically rigorous formulation of this idea leads to the linear sampling method. We note, however, that even for $z \in D$ the integral equation (7.49) is not always (even very rarely) solvable because of the additional requirement that in the solution $(v, w)$ of (7.50a)-(7.50c) the part $v$ has to be a Herglotz wave function. This observation led to the development of the factorization method which we present in Section 7.5 below.


7.4 Uniqueness of the Inverse Problem 


In this section, we want to determine if the knowledge of the far field pattern $u_\infty(\hat{x};\hat{\theta})$ provides enough information to recover the index of refraction $n = n(x)$. Therefore, let two functions $n_1, n_2 \in L^\infty(\mathbb{R}^3)$ be given with $n_1(x) = n_2(x) = 1$ for $|x| \ge a$. We assume that the corresponding far field patterns $u_{1,\infty}$ and $u_{2,\infty}$ coincide, and we wish to show that $n_1$ and $n_2$ also coincide. As a first simple case, we consider the Born approximation again. Let

\[ u^b_{1,\infty}(\hat{x};\hat{\theta}) = u^b_{2,\infty}(\hat{x};\hat{\theta}) \text{ for all } \hat{x} \in S^2 \text{ and some } \hat{\theta} \in S^2. \]

Formula (7.35) implies that $\tilde{m}_1(k\hat{x} - k\hat{\theta}) = \tilde{m}_2(k\hat{x} - k\hat{\theta})$ for all $\hat{x} \in S^2$. Here, $m_j := n_j - 1$ for $j = 1, 2$. Therefore, the Fourier transforms of $m_1$ and $m_2$ coincide on a sphere with center $k\hat{\theta}$ and radius $k > 0$. This, however, is not enough to conclude that $m_1$ and $m_2$ coincide.

Let us now assume that

\[ u^b_{1,\infty}(\hat{x};\hat{\theta}) = u^b_{2,\infty}(\hat{x};\hat{\theta}) \text{ for all } \hat{x} \in S^2 \text{ and all } \hat{\theta} \in S^2. \]

Then $\tilde{m}_1(k\hat{x} - k\hat{\theta}) = \tilde{m}_2(k\hat{x} - k\hat{\theta})$ for all $\hat{x}, \hat{\theta} \in S^2$. Therefore, the Fourier transforms coincide on the set $\{k(\hat{x} - \hat{\theta}) : \hat{x}, \hat{\theta} \in S^2\}$, which describes a ball in $\mathbb{R}^3$ with center zero and radius $2k$. The Fourier transforms of $m_1$ and $m_2$ are analytic functions, therefore the unique continuation principle for analytic functions yields that $\tilde{m}_1$ and $\tilde{m}_2$ coincide on all of $\mathbb{R}^3$ and thus $m_1 = m_2$. Therefore, the knowledge of $\{u^b_\infty(\hat{x};\hat{\theta}) : \hat{x}, \hat{\theta} \in S^2\}$ is (theoretically) sufficient to recover the refractive index.
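The geometry of the sampled frequencies in the Born argument can be illustrated numerically. The following is only a sketch under stated assumptions: the helper names `random_unit_vectors` and `born_sample_points`, the wave number `k = 3.0`, and the fixed direction `theta0` are hypothetical choices for illustration and do not appear in the text. It samples points $k(\hat{x} - \hat{\theta})$ and checks that they stay inside the ball of radius $2k$, and that for one fixed $\hat{\theta}$ they lie on a sphere of radius $k$.

```python
import numpy as np

def random_unit_vectors(count, rng):
    """Draw `count` directions on the unit sphere S^2 (illustrative helper)."""
    v = rng.normal(size=(count, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def born_sample_points(k, count=2000, seed=0):
    """Frequencies k*(x_hat - theta_hat) at which the Born data fix the Fourier transform."""
    rng = np.random.default_rng(seed)
    x_hat = random_unit_vectors(count, rng)
    theta_hat = random_unit_vectors(count, rng)
    return k * (x_hat - theta_hat)

k = 3.0  # hypothetical wave number
xi = born_sample_points(k)
radii = np.linalg.norm(xi, axis=1)

# One fixed incident direction: the points k*(x_hat - theta0) lie on a sphere of radius k.
theta0 = np.array([0.0, 0.0, 1.0])
x_hat = random_unit_vectors(500, np.random.default_rng(1))
xi_fixed = k * (x_hat - theta0)
dist_to_center = np.linalg.norm(xi_fixed + k * theta0, axis=1)
```

With all pairs of directions the samples fill a three-dimensional region, whereas a single incident direction only gives a two-dimensional surface; this is the reason the first data set is not sufficient in the argument above.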

The same arguments also show that the knowledge of $u^b_\infty(\hat{x};\hat{\theta})$ for all $\hat{x} \in S^2$, some $\hat{\theta} \in S^2$, and all $k$ from an interval of $\mathbb{R}_{>0}$ is sufficient to recover $n$. We refer to Problem 7.2 for an investigation of this case.

These arguments hold for the Born approximation to the far field pattern. We now prove an analogous uniqueness theorem for the actual far field pattern, which is due to A. Nachman [199], R. Novikov [211], and A. Ramm [220]. The proof consists of three steps, which we formulate as lemmata. For the first result, we consider a fixed refraction index $n \in L^\infty(\mathbb{R}^3)$ with $n(x) = 1$ for $|x| \ge a$ and show that the span of all total fields that correspond to scattering problems with plane incident fields is dense in the space of solutions of the Helmholtz equation in $B(0, a)$.


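Lemma 7.24 Let $n \in L^\infty(\mathbb{R}^3)$ with $n(x) = 1$ for $|x| \ge a$. Let $u(\cdot;\hat{\theta})$ denote the total field corresponding to the incident field $e^{ik\hat{\theta}\cdot x}$. Define the space $H \subset L^2(B(0,a))$ by

\[ H := \mathrm{closure}_{L^2(B(0,a))} \bigl\{ v \in H^2(B(0,a)) : \Delta v + k^2 n v = 0 \text{ in } B(0,a) \bigr\}. \tag{7.52} \]

Then $\mathrm{span}\{u(\cdot;\hat{\theta})|_{B(0,a)} : \hat{\theta} \in S^2\}$ is dense in $H$ with respect to the $L^2$-norm.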


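Proof: By Lemma 7.10 the closed subspace $H$ is contained in the closed space

\[ \tilde{H} := \Bigl\{ v \in L^2(B(0,a)) : \int_{|x|<a} v\,[\Delta\varphi + k^2 n\varphi]\, dx = 0 \text{ for all } \varphi \in H^2_0(B(0,a)) \Bigr\} \]

of all ultra-weak solutions.4 We show that $\mathrm{span}\{u(\cdot;\hat{\theta})|_{B(0,a)} : \hat{\theta} \in S^2\}$ is even dense in $\tilde{H}$. Let $v \in \tilde{H}$ such that

\[ (v, u(\cdot;\hat{\theta}))_{L^2} = \int_{|x|<a} v(x)\, \overline{u(x;\hat{\theta})}\, dx = 0 \text{ for all } \hat{\theta} \in S^2, \]

where we write $(\cdot,\cdot)_{L^2}$ instead of $(\cdot,\cdot)_{L^2(B(0,a))}$. The Lippmann-Schwinger equation (7.26) yields $u(\cdot;\hat{\theta}) = (I + T)^{-1} u^i(\cdot;\hat{\theta})$ with $u^i(x;\hat{\theta}) = \exp(ik\, x\cdot\hat{\theta})$; thus

\[ 0 = \bigl(v, (I + T)^{-1} u^i(\cdot;\hat{\theta})\bigr)_{L^2} = \bigl((I + T^*)^{-1} v, u^i(\cdot;\hat{\theta})\bigr)_{L^2} = \bigl(w, u^i(\cdot;\hat{\theta})\bigr)_{L^2} \text{ for all } \hat{\theta} \in S^2, \tag{7.53} \]

where $w := (I + T^*)^{-1} v$. Then $w \in L^2(B(0,a))$, and $w$ satisfies the "adjoint equation"

\[ v(x) = w(x) + k^2\, \overline{(1 - n(x))} \int_{|y|<a} \overline{\Phi(x,y)}\, w(y)\, dy, \quad x \in B(0,a); \]

that is,

\[ \overline{v(x)} = \overline{w(x)} + k^2 (1 - n(x)) \int_{|y|<a} \Phi(x,y)\, \overline{w(y)}\, dy, \quad x \in B(0,a). \tag{7.54} \]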


4 $\tilde{H}$ and $H$ even coincide but we do not need this.
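Now set

\[ \tilde{w}(x) := \int_{|y|<a} \overline{w(y)}\, \Phi(x,y)\, dy \text{ for } x \in \mathbb{R}^3. \]

Then $\tilde{w}$ is a volume potential with $L^2$-density $\overline{w}$. We know from Theorem 7.11 that $\tilde{w} \in H^2_{loc}(\mathbb{R}^3)$ satisfies $\Delta\tilde{w} + k^2\tilde{w} = -\overline{w}$ in $\mathbb{R}^3$ almost everywhere. The far field pattern $\tilde{w}_\infty$ of $\tilde{w}$ vanishes by (7.53) because

\[ \tilde{w}_\infty(\hat{\theta}) = \frac{1}{4\pi} \int_{|y|<a} \overline{w(y)}\, e^{-ik\,\hat{\theta}\cdot y}\, dy = \frac{1}{4\pi}\, \overline{\bigl(w, u^i(\cdot;-\hat{\theta})\bigr)_{L^2}} = 0 \]

for all $\hat{\theta} \in S^2$. Rellich's lemma implies that $\tilde{w}(x) = 0$ for $|x| > a$; that is, $\tilde{w} \in H^2_0(B(0,a))$ by Lemma 7.4. From (7.54) we conclude that

\[ \overline{v} = \overline{w} + k^2 (1 - n)\,\tilde{w} = -\Delta\tilde{w} - k^2 n\,\tilde{w} \text{ in } B(0,a), \]

and thus

\[ \int_{|x|<a} |v|^2\, dx = \int_{|x|<a} v\,\overline{v}\, dx = -\int_{|x|<a} v\,[\Delta\tilde{w} + k^2 n\,\tilde{w}]\, dx = 0 \]

since $v \in \tilde{H}$ and $\tilde{w} \in H^2_0(B(0,a))$. Therefore, $v$ vanishes.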


The second lemma proves a certain "orthogonality relation" between solutions of the Helmholtz equation with different indices of refraction $n_1$ and $n_2$.


Lemma 7.25 Let $n_1, n_2 \in L^\infty(\mathbb{R}^3)$ be two indices of refraction with $n_1(x) = n_2(x) = 1$ for all $|x| \ge a$ and assume that $u_{1,\infty}(\hat{x};\hat{\theta}) = u_{2,\infty}(\hat{x};\hat{\theta})$ for all $\hat{x}, \hat{\theta} \in S^2$. Then

\[ \int_{|x|<a} v_1(x)\, v_2(x)\, [n_1(x) - n_2(x)]\, dx = 0 \tag{7.55} \]

for all solutions $v_j \in H^2(B(0,a))$ of the Helmholtz equation $\Delta v_j + k^2 n_j v_j = 0$, $j = 1, 2$, in $B(0,a)$.


Proof: Let $v_1 \in H^2(B(0,a))$ be any fixed solution of $\Delta v_1 + k^2 n_1 v_1 = 0$ in $B(0,a)$. By the denseness result of Lemma 7.24 it is sufficient to prove the assertion for $v_2 := u_2(\cdot;\hat{\theta})$ and arbitrary $\hat{\theta} \in S^2$. We set $u = u_1(\cdot;\hat{\theta}) - u_2(\cdot;\hat{\theta})$ which is the same as the difference of the corresponding scattered fields. From $u_{1,\infty}(\cdot;\hat{\theta}) = u_{2,\infty}(\cdot;\hat{\theta})$ and Rellich's Lemma 7.5, it follows that $u \in H^2(\mathbb{R}^3)$ vanishes outside of $B(0,a)$; that is, $u \in H^2_0(B(0,a))$ by Lemma 7.4. Furthermore, $u$ satisfies the inhomogeneous Helmholtz equation

\[ \Delta u + k^2 n_1 u = k^2 (n_2 - n_1)\, u_2(\cdot;\hat{\theta}) = k^2 (n_2 - n_1)\, v_2 \text{ in } \mathbb{R}^3. \]

We multiply this equation by $v_1$ and integrate over $B(0,a)$:

\[ k^2 \int_{|x|<a} (n_2 - n_1)\, v_1 v_2\, dx = \int_{|x|<a} v_1\, [\Delta u + k^2 n_1 u]\, dx = 0 \]

by Green's formula (7.12b).


The original proof of the third important "ingredient" of the uniqueness proof was first given in [258]. It is of independent interest and states that the set of all products $v_1 v_2$ of functions $v_j$ that satisfy the Helmholtz equations $\Delta v_j + k^2 n_j v_j = 0$ in some bounded region $\Omega$ is dense in $L^2(\Omega)$. This is exactly the kind of argument we have used already for the uniqueness proof in the linearized problem of impedance tomography (see Theorem 6.9). The situation in this chapter is more complicated because we have to consider products of solutions of different differential equations with nonconstant coefficients. The idea is to construct solutions $u$ of the Helmholtz equation $\Delta u + k^2 n u = 0$ in $B(0,a)$ that behave asymptotically as $\exp(z\cdot x)$. Here we take $n = n_1$ or $n_2$. The following result is crucial.


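Theorem 7.26 Let again $n(x) = 1$ for $|x| \ge a$. Then there exist $T > 0$ and $C > 0$ such that for all $z \in \mathbb{C}^3$ with $z\cdot z = \sum_{j=1}^3 z_j^2 = 0$ and $|z| \ge T$ there exists a solution $u_z \in H^2(B(0,a))$ of the differential equation

\[ \Delta u_z + k^2 n u_z = 0 \text{ in } B(0,a) \tag{7.56} \]

of the form

\[ u_z(x) = e^{z\cdot x}\,(1 + v_z(x)), \quad x \in B(0,a), \tag{7.57} \]

where $v_z \in H^2(B(0,a))$ satisfies the estimate

\[ \|v_z\|_{L^2(B(0,a))} \le \frac{C}{|z|} \text{ for all } z \in \mathbb{C}^3 \text{ with } z\cdot z = 0 \text{ and } |z| \ge T. \tag{7.58} \]
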
Proof: The proof consists of two parts. First, we construct $v_z$ for the special case $z = t\hat{e}$, where $\hat{e} = (1, i, 0)^\top \in \mathbb{C}^3$ and $t$ is sufficiently large. In the second part, we consider the general case by rotating the geometry.

Let $z = t\hat{e}$ for some $t > 0$. By scaling the functions as in the proof of Theorem 7.7, we can assume without loss of generality that $B(0,a)$ is contained in the cube $Q = [-\pi,\pi]^3 \subset \mathbb{R}^3$. We substitute the ansatz

\[ u(x) = e^{t\hat{e}\cdot x}\, \bigl[1 + \exp(-i/2\, x_1)\, w_t(x)\bigr] \]

into the Helmholtz equation (7.56). This yields the following differential equation for $w_t$:

\[ \Delta w_t(x) + (2t\hat{e} - i\hat{p})\cdot\nabla w_t(x) - (it + 1/4)\, w_t(x) = -k^2 n(x)\, w_t(x) - k^2 n(x) \exp(i/2\, x_1) \text{ in } Q, \]

where $\hat{p} = (1, 0, 0)^\top \in \mathbb{R}^3$. We refer to the proof of the unique continuation principle (Theorem 7.7) for the same kind of transformation.

We determine a $2\pi$-periodic solution of this equation. Because this equation has the form of (7.16) (for $\alpha = 1/4$), we use the solution operator $L_t$ of Lemma 7.6 and write this equation in the form

\[ w_t + k^2 L_t(n w_t) = L_t \tilde{n} \text{ in } Q, \tag{7.59} \]

where we have set $\tilde{n}(x) = -k^2 n(x) \exp(i/2\, x_1)$. For large values of $t$, the operator $K_t : w \mapsto k^2 L_t(n w)$ is a contraction mapping in $L^2(Q)$. This follows from the estimates

\[ \|K_t w\|_{L^2(Q)} = k^2 \|L_t(n w)\|_{L^2(Q)} \le \frac{k^2}{t}\, \|n w\|_{L^2(Q)} \le \frac{k^2 \|n\|_\infty}{t}\, \|w\|_{L^2(Q)}, \]

which implies that $\|K_t\|_{\mathcal{L}(L^2(Q))} < 1$ for sufficiently large $t > 0$. For these values of $t$ there exists a unique solution $w_t \in H^2_{per}(Q)$ of (7.59). The solution depends continuously on the right-hand side, therefore we conclude that there exists $c > 0$ with

\[ \|w_t\|_{L^2(Q)} \le c\, \|L_t \tilde{n}\|_{L^2(Q)} \le \frac{c\, k^2}{t}\, \|n\|_{L^2(Q)} \]

for all $t \ge T$ and some $T > 0$. This proves the theorem for the special choice $z = t\hat{e}$.

Now let $z \in \mathbb{C}^3$ be arbitrary with $z\cdot z = 0$ and $|z| \ge T$. From this, we observe that $|\mathrm{Re}\, z| = |\mathrm{Im}\, z|$ and $(\mathrm{Re}\, z)\cdot(\mathrm{Im}\, z) = 0$. We decompose $z$ in the unique form $z = t(\hat{a} + i\hat{b})$ with $\hat{a}, \hat{b} \in S^2$ and $t > 0$ and $\hat{a}\cdot\hat{b} = 0$. We define the cross-product $\hat{c} = \hat{a} \times \hat{b}$ and the orthogonal matrix $R = [\hat{a}\ \hat{b}\ \hat{c}] \in \mathbb{R}^{3\times 3}$. Then $tR\hat{e} = z$ and thus $R^\top z = t\hat{e}$. The substitution $x \mapsto Rx$ transforms the Helmholtz equation (7.56) into

\[ \Delta w(x) + k^2 n(Rx)\, w(x) = 0, \quad x \in B(0,a), \]

for $w(x) = v(Rx)$, $x \in B(0,a)$. Application of the first part of this proof yields the existence of a solution $w$ of this equation of the form

\[ w(x) = e^{t\hat{e}\cdot x}\, \bigl[1 + \exp(-i/2\, x_1)\, w_t(x)\bigr], \]

where $w_t$ satisfies $\|w_t\|_{L^2(Q)} \le C/t$ for $t \ge T$. From $v(x) = w(R^\top x)$, we conclude that

\[ v(x) = e^{t\hat{e}\cdot R^\top x}\, \bigl[1 + \exp(-i/2\, \hat{a}\cdot x)\, w_t(R^\top x)\bigr] = e^{z\cdot x}\, \bigl[1 + \exp(-i/2\, \hat{a}\cdot x)\, w_t(R^\top x)\bigr], \]

which proves the theorem also for the general case.
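The rotation step of the proof can be checked numerically. The sketch below is an illustration only: the function name `rotation_for` and the test vectors `a`, `b` are hypothetical and not taken from the text. For a given $z$ with $z\cdot z = 0$ it builds $\hat{a}$, $\hat{b}$, $\hat{c} = \hat{a}\times\hat{b}$ and the matrix $R$, so that one can confirm that $R$ is orthogonal and $R^\top z = t\hat{e}$.

```python
import numpy as np

def rotation_for(z):
    """Return (t, R) with z = t*(a_hat + i*b_hat) and R = [a_hat, b_hat, a_hat x b_hat].

    Assumes z.z = 0 (bilinear product, no conjugation), so that Re z and Im z
    are orthogonal and of equal length.
    """
    re, im = z.real, z.imag
    t = np.linalg.norm(re)
    a_hat = re / t
    b_hat = im / t
    c_hat = np.cross(a_hat, b_hat)
    return t, np.column_stack([a_hat, b_hat, c_hat])

# Hypothetical example: two orthonormal real vectors scaled by t = 5.
a = np.array([1.0, 2.0, 2.0]) / 3.0
b = np.array([2.0, 1.0, -2.0]) / 3.0
z = 5.0 * (a + 1j * b)

t, R = rotation_for(z)
e_hat = np.array([1.0, 1j, 0.0])
zz = z @ z  # bilinear product sum_j z_j^2; `@` does not conjugate
```

Note that the product $z\cdot z$ is bilinear, not the Hermitian inner product, which is why the code uses `z @ z` rather than `np.vdot`.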


Now we are able to prove the following analogy of Calderón's approach (compare with the proof of Theorem 6.9).

Theorem 7.27 Let again $n_1, n_2 \in L^\infty(\mathbb{R}^3)$ such that $n_1(x) = n_2(x) = 1$ for $|x| \ge a$. Then the span of the set

\[ P := \bigl\{ u_1 u_2 : u_j \in H^2(B(0,a)) \text{ solves } \Delta u_j + k^2 n_j u_j = 0 \bigr\} \]

of products is dense in $L^1(B(0,a))$. (We note that the product of two $L^2$-functions is in $L^1$ by the theorem of Cauchy-Schwarz.)

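Proof: Since $L^\infty(B(0,a))$ is the dual space of $L^1(B(0,a))$ we have to show that any $g \in L^\infty(B(0,a))$ with $\int_{|x|<a} u\, g\, dx = 0$ for all $u \in P$ has to vanish. Therefore, let $g \in L^\infty(B(0,a))$ such that

\[ \int_{|x|<a} g(x)\, u_1(x)\, u_2(x)\, dx = 0 \tag{7.60} \]

for all solutions $u_j \in H^2(B(0,a))$ of the Helmholtz equation $\Delta u_j + k^2 n_j u_j = 0$ in $B(0,a)$, $j = 1, 2$.

Fix an arbitrary vector $y \in \mathbb{R}^3 \setminus \{0\}$ and a number $\rho > 0$. Choose a unit vector $\hat{a} \in \mathbb{R}^3$ and a vector $b \in \mathbb{R}^3$ with $|b|^2 = |y|^2 + \rho^2$ such that $\{y, \hat{a}, b\}$ forms an orthogonal system in $\mathbb{R}^3$. Set

\[ z^1 := \frac{1}{2}\, b - \frac{i}{2}\,(y + \rho\hat{a}) \quad\text{and}\quad z^2 := -\frac{1}{2}\, b - \frac{i}{2}\,(y - \rho\hat{a}). \]

Then $z^j\cdot z^j = |\mathrm{Re}\, z^j|^2 - |\mathrm{Im}\, z^j|^2 + 2i\, \mathrm{Re}\, z^j\cdot\mathrm{Im}\, z^j = |b|^2/4 - (|y|^2 + \rho^2)/4 = 0$ and $|z^j|^2 = (|b|^2 + |y|^2 + \rho^2)/4 \ge \rho^2/4$. Furthermore, $z^1 + z^2 = -iy$.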
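The three properties of $z^1$ and $z^2$ used in the proof can be verified for concrete numbers. The following sketch assumes one particular way of choosing $\hat{a}$ and $b$ via cross products; the function name `complex_directions` and the sample values of `y` and `rho` are hypothetical and only serve as an illustration.

```python
import numpy as np

def complex_directions(y, rho):
    """Build z1, z2 in C^3 with z_j.z_j = 0 and z1 + z2 = -i*y.

    a_hat is a unit vector orthogonal to y, and b is orthogonal to both
    with |b|^2 = |y|^2 + rho^2. The way a_hat is picked here is one
    arbitrary choice among many.
    """
    y = np.asarray(y, dtype=float)
    # Coordinate axis belonging to the smallest component of y: never parallel to y != 0.
    helper = np.eye(3)[np.argmin(np.abs(y))]
    a_hat = np.cross(y, helper)
    a_hat /= np.linalg.norm(a_hat)
    b = np.cross(y, a_hat)
    b *= np.sqrt(y @ y + rho**2) / np.linalg.norm(b)
    z1 = 0.5 * b - 0.5j * (y + rho * a_hat)
    z2 = -0.5 * b - 0.5j * (y - rho * a_hat)
    return z1, z2

y = np.array([1.0, 2.0, 3.0])  # hypothetical frequency vector
rho = 10.0                     # hypothetical parameter, later sent to infinity
z1, z2 = complex_directions(y, rho)
```

The construction needs three mutually orthogonal real directions $y$, $\hat{a}$, $b$, which is the reason it is tied to $\mathbb{R}^3$.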

Now we apply Theorem 7.26 with $z^j$ to the Helmholtz equations $\Delta u_j + k^2 n_j u_j = 0$ in $B(0,a)$. We substitute the forms (7.57) of $u_j$ into the orthogonality relation (7.60) and arrive at

\[ 0 = \int_{|x|<a} e^{(z^1 + z^2)\cdot x}\, [1 + v_1(x)]\, [1 + v_2(x)]\, g(x)\, dx = \int_{|x|<a} e^{-iy\cdot x}\, [1 + v_1(x) + v_2(x) + v_1(x)\, v_2(x)]\, g(x)\, dx. \]

By Theorem 7.26, there exist constants $T > 0$ and $C > 0$ with

\[ \|v_j\|_{L^2(B(0,a))} \le \frac{C}{|z^j|} \le \frac{2C}{\rho} \]

for all $\rho \ge T$. Now we use the Cauchy-Schwarz inequality and let $\rho$ tend to infinity. This yields

\[ \int_{|x|<a} e^{-iy\cdot x}\, g(x)\, dx = 0. \]

Because the vector $y \in \mathbb{R}^3 \setminus \{0\}$ was arbitrary, we conclude that the Fourier transform of $g$ (extended by zero into $\mathbb{R}^3$) vanishes. This yields $g = 0$.


As a corollary, we have the following uniqueness theorem. 
Theorem 7.28 Let $n_1, n_2 \in L^\infty(\mathbb{R}^3)$ be two indices of refraction with $n_1(x) = n_2(x) = 1$ for all $|x| \ge a$. Let $u_{1,\infty}$ and $u_{2,\infty}$ be the corresponding far field patterns, and assume that they coincide; that is, $u_{1,\infty}(\hat{x};\hat{\theta}) = u_{2,\infty}(\hat{x};\hat{\theta})$ for all $\hat{x}, \hat{\theta} \in S^2$. Then $n_1 = n_2$.

Proof: We combine the orthogonality relation of Lemma 7.25 with the denseness result of Theorem 7.27. This yields that $n_1 - n_2 \in L^\infty(B(0,a))$ satisfies $\int_{|x|<a} (n_1 - n_2)\, h\, dx = 0$ for all $h \in L^1(B(0,a))$. Therefore, $n_1 - n_2$ has to vanish.


The proof of Theorem 7.27 does not work in $\mathbb{R}^2$ because in that case there is no corresponding decomposition of $y$. However, using more complicated families of solutions, uniqueness of the two-dimensional case has been shown by Bukhgeim in [28].


7.5 The Factorization Method 


It is the aim of this section to transfer the factorization method of the previous chapter to the present scattering problem.5 Therefore, in this section we are only interested in determining the support of $n - 1$. We make the same kind of assumptions on this support as in the previous chapter (compare to Assumptions 6.10).


Assumption 7.29 Let there exist finitely many Lipschitz domains $D_j$, $j = 1, \ldots, M$, such that $\overline{D_j} \cap \overline{D_k} = \emptyset$ for $j \ne k$ and such that the complement $\mathbb{R}^3 \setminus \overline{D}$ of the closure of the union $D = \bigcup_{j=1}^M D_j$ is connected. Furthermore, let $n \in L^\infty(\mathbb{R}^3)$ be real-valued such that there exists $c_0 > 0$ with $n = 1$ on $\mathbb{R}^3 \setminus D$ and $m = n - 1 \ge c_0$ on $D$.


The far field operator $F$ from (7.40) plays the role of the difference $\Lambda - \Lambda_1$ of the Neumann-Dirichlet operators. The first ingredient of the factorization method is again the factorization of the data operator. To motivate the operators that appear in the factorization we write the Helmholtz equation (7.10) in terms of the scattered field as

\[ \Delta u^s + k^2 n u^s = k^2 (1 - n)\, u^i = -k^2 m\, u^i \text{ in } \mathbb{R}^3, \tag{7.61} \]

where we have again defined the contrast $m$ by $m = n - 1$. The source on the right-hand side is of a special form. We allow more general sources and consider radiating6 solutions $v \in H^2_{loc}(\mathbb{R}^3)$ of equations of the form

\[ \Delta v + k^2 n v = -m f \text{ in } \mathbb{R}^3 \tag{7.62} \]


5 Actually, it was a scattering problem for which the factorization method was first discovered ([157]) before it was applied to the problem of electrical impedance tomography in [23, 24].

6 that is, satisfies the radiation condition (7.11)
for any $f \in L^2(D)$. Here we extended $m$ and $f$ by zero into $\mathbb{R}^3$. This radiation problem has a unique solution for every $f \in L^2(D)$. Indeed, we take again $a > 0$ such that $\overline{D} \subset B(0,a)$ and consider the integral equation $v + Tv = g$ with the integral operator $T$ from (7.27) and $g \in L^2(B(0,a))$ given by

\[ g(x) = \int_{|y|<a} (n(y) - 1)\, \Phi(x,y)\, f(y)\, dy, \quad |x| < a. \]

By Remark 7.14 this equation $v + Tv = g$ has a unique solution. We rewrite the equation as

\[ v(x) = k^2 \int_{|y|<a} (n(y) - 1)\, \Phi(x,y)\, v(y)\, dy + \int_{|y|<a} (n(y) - 1)\, \Phi(x,y)\, f(y)\, dy, \]

which shows (Theorem 7.11) that $v \in H^2_{loc}(\mathbb{R}^3)$, and $v$ is a radiating solution of $\Delta v + k^2 v = k^2 (1 - n)\, v + (1 - n)\, f$; that is, $\Delta v + k^2 n v = -m f$.


We define the operator $G$ from $L^2(D)$ into $L^2(S^2)$ by $Gf = v_\infty$ where $v \in H^2_{loc}(\mathbb{R}^3)$ is the radiating solution of (7.62). Then we can prove the following factorization of the far field operator $F$.


Theorem 7.30 Again let $G : L^2(D) \to L^2(S^2)$ be defined by $Gf = v_\infty$, where $v \in H^2_{loc}(\mathbb{R}^3)$ is the radiating solution of (7.62). Then

\[ F = 4\pi k^2\, G S^* G^*, \tag{7.63} \]

where $S^*$ is the $L^2$-adjoint of $S : L^2(D) \to L^2(D)$ defined by

\[ (S\psi)(x) = \frac{1}{m(x)}\, \psi(x) - k^2 \int_D \psi(y)\, \Phi(x,y)\, dy, \quad x \in D. \tag{7.64} \]

We note that the integral in the definition of $S$ is a volume potential with density $\psi$ and can be extended to a function $w \in H^2_{loc}(\mathbb{R}^3)$ that radiates and is a solution of

\[ \Delta w + k^2 w = -\psi \text{ in } \mathbb{R}^3. \tag{7.65} \]


Proof of Theorem 7.30: From (7.61) and the definition of $G$ we observe that $u_\infty = k^2 G u^i$. As an auxiliary operator we define $H : L^2(S^2) \to L^2(D)$ by

\[ (Hg)(x) = \int_{S^2} g(\hat{\theta})\, e^{ik\, x\cdot\hat{\theta}}\, ds(\hat{\theta}) = \int_{S^2} g(\hat{\theta})\, u^i(x;\hat{\theta})\, ds(\hat{\theta}), \quad x \in D, \tag{7.66} \]

where $u^i(\cdot;\hat{\theta})$ denotes the incident field of direction $\hat{\theta}$. By the superposition principle, $Fg$ is the far field pattern corresponding to the incident field $Hg$; that is by (7.61), $F = k^2 G H$. Now we consider the adjoint $H^*$ of $H$ which is given by

\[ (H^*\psi)(\hat{x}) = \int_D \psi(y)\, e^{-ik\,\hat{x}\cdot y}\, dy, \quad \hat{x} \in S^2. \]

From the asymptotic behavior (7.20) of the fundamental solution $\Phi$ we observe that $H^*\psi = 4\pi\, w_\infty$ where $w_\infty$ is the far field pattern of the volume potential

\[ w(x) = \int_D \psi(y)\, \Phi(x,y)\, dy, \quad x \in \mathbb{R}^3. \]

By Theorem 7.11 the potential $w \in H^2_{loc}(\mathbb{R}^3)$ satisfies (7.65); that is,

\[ \Delta w + k^2 n w = -m \Bigl( \frac{1}{m}\, \psi - k^2 w \Bigr). \]

Using the definition of $G$ this yields $H^*\psi = 4\pi\, w_\infty = 4\pi\, G(\psi/m - k^2 w) = 4\pi\, G S\psi$; that is, $H^* = 4\pi\, G S$ and thus $H = 4\pi\, S^* G^*$. Substituting this into $F = k^2 G H$ yields the assertion.


Therefore, we arrived at a factorization of the far field operator in the form $F = G T G^*$ with $T = 4\pi k^2 S^*$. It has the same form as (6.17) (with $A^*$ replaced by $G$) but there is an essential difference: In contrast to the operator $T$ in the factorization (6.17) the operator $T$ (i.e., $S$) fails to be self-adjoint. Otherwise, the operator $F$ would be self-adjoint which is not the case. $F$ is only normal by Theorem 7.20. However, we can prove an analogous characterization of $D$ by the range of $G$ as in Theorem 6.14.


Theorem 7.31 For any $z \in \mathbb{R}^3$ define the function $\phi_z \in L^2(S^2)$ by

\[ \phi_z(\hat{x}) := e^{-ik\,\hat{x}\cdot z}, \quad \hat{x} \in S^2. \tag{7.67} \]

Then $z$ belongs to $D$ if and only if $\phi_z$ belongs to the range $\mathcal{R}(G)$ of $G$.


Proof: It is very similar to the proof of Theorem 6.14.

First let $z \in D$. Choose a ball $B(z,\varepsilon) = \{x \in \mathbb{R}^3 : |x - z| < \varepsilon\}$ with center $z$ and radius $\varepsilon > 0$ such that its closure $B[z,\varepsilon] \subset D$. Furthermore, choose a function $\varphi \in C^\infty(\mathbb{R}^3)$ such that $\varphi(x) = 0$ for $|x - z| \le \varepsilon/2$ and $\varphi(x) = 1$ for $|x - z| \ge \varepsilon$ and set $v(x) = 4\pi\, \varphi(x)\, \Phi(x,z)$ for $x \in \mathbb{R}^3$. Then $v$ is a $C^\infty$-function and coincides with $4\pi\,\Phi(\cdot,z)$ outside of $D$. By (7.20) the far field pattern of $v$ is given by $\phi_z$. Therefore, $\phi_z = Gf$ with $f = -(\Delta v + k^2 n v)/m$ in $D$ which proves the first part.

Now let $z \notin D$ and assume, on the contrary, that $\phi_z = Gf \in \mathcal{R}(G)$ for some $f \in L^2(D)$. Let $v \in H^2_{loc}(\mathbb{R}^3)$ be the corresponding radiating solution of (7.62). Because $\phi_z$ is the far field pattern of $4\pi\,\Phi(\cdot,z)$ and $Gf$ is the far field pattern of $v$ we conclude from Rellich's lemma that $4\pi\,\Phi(\cdot,z) = v$ in the exterior of $D \cup \{z\}$. Now one argues exactly as in the proof of Theorem 6.14 or Theorem 7.22, part (c). If $z \notin \overline{D}$ then $v$ is smooth in $z$ and $4\pi\,\Phi(\cdot,z)$ has a singularity in $z$ that leads to a contradiction. For $z \in \partial D$ one again chooses a bounded piece $C_0 \subset \mathbb{R}^3$ of an open cone with vertex at $z$ (see Assumption 6.10 for the definition in $\mathbb{R}^2$ with an obvious extension to $\mathbb{R}^3$) and $C_0 \cap D = \emptyset$ and shows that $\Phi(\cdot,z) \notin H^2(C_0)$. This contradicts $v \in H^2(C_0)$ and the fact that $v$ and $4\pi\,\Phi(\cdot,z)$ coincide.

The third step in the factorization method expresses the range of the operator $G$ by the known far field operator $F$. First we again collect properties of the middle operator $S$ of the factorization (7.63).


Theorem 7.32 Again let $S : L^2(D) \to L^2(D)$ be defined by (7.64) and $m = n - 1$.


(a) Let $S_0$ be given by $S_0\psi = \psi/m$ in $D$. Then $S_0$ is bounded, self-adjoint, and coercive:

\[ (S_0\psi, \psi)_{L^2(D)} \ge \frac{1}{\|m\|_\infty}\, \|\psi\|^2_{L^2(D)} \text{ for all } \psi \in L^2(D). \tag{7.68} \]

(b) The difference $S - S_0$ is compact from $L^2(D)$ into itself.

(c) $S$ is an isomorphism from $L^2(D)$ onto itself.

(d) $\mathrm{Im}\,(S\psi, \psi)_{L^2(D)} \le 0$ for all $\psi \in L^2(D)$. Also, if $k^2$ is not an interior transmission eigenvalue (see Definition 7.21) then $\mathrm{Im}\,(S\psi, \psi)_{L^2(D)} < 0$ for all $\psi$ in the $L^2$-closure of the range $\mathcal{R}(G^*)$ of $G^*$ with $\psi \ne 0$.


Proof: (a) This is obvious because $S_0\psi$ is just the multiplication of $\psi$ by a function that is bounded below by $1/\|m\|_\infty$ and above by $1/c_0$.

(b) We have already used (see the proof of Theorem 7.13 where this operator appears in the Lippmann-Schwinger equation) that the volume potential $S\psi - S_0\psi$ defines a compact operator in $L^2(D)$.

(c) By parts (a) and (b) and the Theorem of Riesz (Theorem A.36 of the Appendix) it is sufficient to prove injectivity of $S$. Let $S\psi = 0$ in $D$. Setting $\varphi = \psi/m$ we conclude that

\[ \varphi(x) - k^2 \int_D m(y)\, \varphi(y)\, \Phi(x,y)\, dy = 0, \quad x \in D. \]

Therefore, $\varphi$ solves the homogeneous Lippmann-Schwinger integral equation (7.26) and has thus to vanish by the uniqueness of the scattering problem. Therefore, $\psi$ also vanishes.


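(d) Let $\psi \in L^2(D)$ be arbitrary. Extend $\psi$ and $m$ by zero into $\mathbb{R}^3$ and set $f = \psi - k^2 m w$ in $\mathbb{R}^3$ where $w \in H^2_{loc}(\mathbb{R}^3)$ is the volume potential with density $\psi$. Then $S\psi = f/m|_D$. Because $w$ satisfies $\Delta w + k^2 w = -\psi$ we observe that $w$ satisfies also $\Delta w + k^2 n w = -\psi + k^2 m w = -f$. Now we compute, by replacing $\psi$ by $f + k^2 m w$,

\[ (S\psi, \psi)_{L^2(D)} = \int_D \frac{1}{m}\, f\, \bigl[\overline{f} + k^2 m\, \overline{w}\bigr]\, dx = \int_D \frac{1}{m}\, |f|^2\, dx + k^2 \int_D f\, \overline{w}\, dx \]
\[ = \int_D \frac{1}{m}\, |f|^2\, dx - k^2 \int_{|x|<R} [\Delta w + k^2 n w]\, \overline{w}\, dx, \tag{7.69} \]

where $R > a$ and $a > 0$ is again chosen such that $\overline{D}$ is contained in $B(0,a)$. Application of Green's formula (7.13) yields

\[ \int_{|x|<R} [\Delta w + k^2 n w]\, \overline{w}\, dx = \int_{|x|<R} \bigl[k^2 n |w|^2 - |\nabla w|^2\bigr]\, dx + \int_{|x|=R} \overline{w}\, \frac{\partial w}{\partial\nu}\, ds. \]

Substituting this into (7.69) and taking the imaginary part yields

\[ \mathrm{Im}\,(S\psi, \psi)_{L^2(D)} = -k^2\, \mathrm{Im} \int_{|x|=R} \overline{w}\, \frac{\partial w}{\partial\nu}\, ds. \]

Now we use the fact that $|w(x)|$ decays as $1/|x|$ and, by the radiation condition (7.11), $\partial w/\partial r - ikw$ decays as $1/|x|^2$. Therefore, we can replace $\partial w/\partial\nu$ by $ikw$ in the last formula and have that

\[ \mathrm{Im}\,(S\psi, \psi)_{L^2(D)} = -k^3 \int_{|x|=R} |w|^2\, ds + O(1/R). \]

Letting $R$ tend to infinity yields by the definition (7.30) of the far field pattern

\[ \mathrm{Im}\,(S\psi, \psi)_{L^2(D)} = -k^3 \int_{S^2} |w_\infty|^2\, ds, \]

which is nonpositive.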

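Now let $\psi \in \mathrm{closure}\,\mathcal{R}(G^*) = \mathcal{N}(G)^\perp$ such that $\mathrm{Im}\,(S\psi, \psi)_{L^2(D)} = 0$. As we see from the previous equation, the far field pattern $w_\infty$ of the corresponding volume potential $w$ vanishes. Rellich's Lemma 7.5 and unique continuation yield that $w$ vanishes outside of $D$, thus $w \in H^2_0(D)$ by Lemma 7.4. Let $\varphi \in H^2_0(D)$. We extend $\varphi$ by zero into the exterior of $D$. Then $\varphi \in H^2(\mathbb{R}^3)$ and $\tilde{\psi} := \frac{1}{m}\,[\Delta\overline{\varphi} + k^2 n\,\overline{\varphi}] \in \mathcal{N}(G)$ by the definition of $G$ because $\Delta\overline{\varphi} + k^2 n\,\overline{\varphi} = m\tilde{\psi}$ in $\mathbb{R}^3$ with vanishing far field pattern. Now we set $\tilde{w} := \psi/m \in L^2(D)$ and $\tilde{v} := \tilde{w} - k^2 w$. Then

\[ 0 = (\psi, \tilde{\psi})_{L^2(D)} = \int_D \psi\, \overline{\tilde{\psi}}\, dx = \int_D \tilde{w}\, [\Delta\varphi + k^2 n\varphi]\, dx \]

for all $\varphi \in H^2_0(D)$. We show that $(\tilde{v}, \tilde{w}) \in L^2(D) \times L^2(D)$ satisfies the interior transmission eigenvalue problem (7.46a). Indeed, let $\varphi_1, \varphi_2 \in H^2(D)$ with $\varphi_1 - \varphi_2 \in H^2_0(D)$. We take $\varphi = \varphi_1 - \varphi_2$ in the previous equation for $\tilde{w}$; that is,

\[ \int_D \tilde{w}\, [\Delta\varphi_1 + k^2 n\varphi_1]\, dx = \int_D \tilde{w}\, [\Delta\varphi_2 + k^2 n\varphi_2]\, dx \]
\[ = \int_D \tilde{v}\, [\Delta\varphi_2 + k^2 n\varphi_2]\, dx + k^2 \int_D w\, [\Delta\varphi_2 + k^2 n\varphi_2]\, dx \]
\[ = \int_D \tilde{v}\, [\Delta\varphi_2 + k^2\varphi_2]\, dx + k^2 \int_D w\, [\Delta\varphi_2 + k^2\varphi_2]\, dx + k^2 \int_D m\,\varphi_2\, [\tilde{v} + k^2 w]\, dx \]
\[ = \int_D \tilde{v}\, [\Delta\varphi_2 + k^2\varphi_2]\, dx + k^2 \int_D \bigl[\psi\,\varphi_2 + w\,(\Delta\varphi_2 + k^2\varphi_2)\bigr]\, dx. \]

It remains to show that the last integral vanishes. Since $w \in H^2_0(D)$ we apply Green's formula (7.12b) and have that

\[ \int_D w\, [\Delta\varphi_2 + k^2\varphi_2]\, dx = \int_D \varphi_2\, [\Delta w + k^2 w]\, dx = -\int_D \varphi_2\, \psi\, dx. \]

Therefore, $(\tilde{v}, \tilde{w}) \in L^2(D) \times L^2(D)$ satisfies the interior transmission eigenvalue problem (7.46a). By assumption $\tilde{v} = \tilde{w} = 0$ in $D$ which implies that $\psi$ vanishes in $D$ and ends the proof.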

Now we continue with the task of expressing the range of $G$ by the known operator $F$. We make the assumption that $k$ is not an interior transmission eigenvalue in the sense that (7.44a), (7.44b) is only solvable by the trivial solution. Then $F$ is one-to-one by Theorem 7.22 and, furthermore, normal by Theorem 7.20 and certainly compact. Therefore, there exists a complete set of orthonormal eigenfunctions $\psi_j \in L^2(S^2)$ with corresponding eigenvalues $\lambda_j \in \mathbb{C}$, $j = 1, 2, 3, \ldots$ (see, e.g., [227]). Furthermore, because the operator $I + (ik)/(2\pi)\, F$ is unitary (see again Theorem 7.20), the eigenvalues $\lambda_j$ of $F$ lie on the circle of radius $1/r$ and center $i/r$ where $r = k/(2\pi)$. We can now argue exactly as in the corresponding case of impedance tomography. The spectral theorem for normal operators yields that $F$ has the form

\[ F\psi = \sum_{j=1}^\infty \lambda_j\, (\psi, \psi_j)_{L^2(S^2)}\, \psi_j, \quad \psi \in L^2(S^2). \tag{7.70} \]

Therefore, $F$ has a second factorization in the form

\[ F = (F^*F)^{1/4}\, R\, (F^*F)^{1/4}, \tag{7.71} \]
where the self-adjoint operator $(F^*F)^{1/4} : L^2(S^2) \to L^2(S^2)$ and the signum $R : L^2(S^2) \to L^2(S^2)$ of $F$ are given by

\[ (F^*F)^{1/4}\psi = \sum_{j=1}^\infty \sqrt{|\lambda_j|}\, (\psi, \psi_j)_{L^2(S^2)}\, \psi_j, \quad \psi \in L^2(S^2), \tag{7.72} \]

\[ R\psi = \sum_{j=1}^\infty \frac{\lambda_j}{|\lambda_j|}\, (\psi, \psi_j)_{L^2(S^2)}\, \psi_j, \quad \psi \in L^2(S^2). \tag{7.73} \]

Again, as in the case of impedance tomography (see (6.25)) we have thus derived two factorizations of $F$, namely

\[ F = 4\pi k^2\, G S^* G^* = (F^*F)^{1/4}\, R\, (F^*F)^{1/4}. \tag{7.74} \]

We now show that these factorizations of $F$ imply again that the ranges of $G$ and $(F^*F)^{1/4}$ coincide. Application of Theorem 7.31 provides then the desired characterization of $D$ by $F$.

The following functional analytic result is a slight extension of Lemma 6.15. 


Lemma 7.33 Let $X$ and $Y$ be Hilbert spaces and $F : X \to X$ and $G : Y \to X$ be linear bounded operators such that the factorization $F = G R G^*$ holds for some linear and bounded operator $R : Y \to Y$ that satisfies a coercivity condition of the form: there exists $c > 0$ with

\[ |(Ry, y)_Y| \ge c\, \|y\|_Y^2 \text{ for all } y \in \mathcal{R}(G^*) \subset Y. \tag{7.75} \]

Then, for any $\phi \in X$, $\phi \ne 0$,

\[ \phi \in \mathcal{R}(G) \iff \inf\bigl\{ |(Fx, x)_X| : x \in X,\ (x, \phi)_X = 1 \bigr\} > 0. \tag{7.76} \]


We omit the proof because it follows exactly the same lines as the proof of 
Lemma 6.15 (see also [160]). 


We note again that the inf-condition depends only on $F$ and not on the factorization itself. Therefore, we have the following corollary.

Corollary 7.34 Let $X$, $Y_1$, and $Y_2$ be Hilbert spaces. Furthermore, let $F : X \to X$ have two factorizations of the form $F = G_1 R_1 G_1^* = G_2 R_2 G_2^*$ with bounded operators $G_j : Y_j \to X$ and $R_j : Y_j \to Y_j$, which both satisfy the coercivity condition (7.75). Then the ranges of $G_1$ and $G_2$ coincide.

In order to apply this corollary to the factorization (7.74) we have to prove that $S : L^2(D) \to L^2(D)$ and $R : L^2(S^2) \to L^2(S^2)$ from (7.64) and (7.73), respectively, satisfy the coercivity conditions (7.75). The coercivity condition for $S$ follows from Theorem 7.32.


Lemma 7.35 Let $k^2$ be no interior transmission eigenvalue in the sense of Definition 7.21. Then there exists $c_1 > 0$ such that

\[ |(S\varphi, \varphi)_{L^2(D)}| \ge c_1\, \|\varphi\|^2_{L^2(D)} \text{ for all } \varphi \in \mathcal{R}(G^*) \subset L^2(D), \tag{7.77} \]

where again $G^* : L^2(S^2) \to L^2(D)$ is the adjoint of $G$.
Proof: We assume, on the contrary, that there exists a sequence $\varphi_j \in \mathcal{R}(G^*)$ with $\|\varphi_j\|_{L^2(D)} = 1$ and $(S\varphi_j, \varphi_j)_{L^2(D)} \to 0$. The unit ball is weakly (sequentially) compact in $L^2(D)$. This is again a conclusion from the theorem of Alaoglu-Bourbaki (see Corollary A.78 of Appendix A.9). Therefore, there exists a weakly convergent subsequence of $(\varphi_j)$. We denote this by $\varphi_j \rightharpoonup \varphi$ and note that $\varphi$ belongs to the closure of the range of $G^*$. Let $S_0$ be the operator from Theorem 7.32. Then,

\[ \bigl(\varphi - \varphi_j, S_0(\varphi - \varphi_j)\bigr)_{L^2(D)} = \bigl(\varphi, S_0(\varphi - \varphi_j)\bigr)_{L^2(D)} - \bigl(\varphi_j, (S_0 - S)(\varphi - \varphi_j)\bigr)_{L^2(D)} + (\varphi_j, S\varphi_j)_{L^2(D)} - (\varphi_j, S\varphi)_{L^2(D)}. \tag{7.78} \]

From the compactness of $S - S_0$ we note that $\|(S - S_0)(\varphi - \varphi_j)\|_{L^2(D)}$ tends to zero (see Theorem A.76, part (e)) and thus also $\bigl(\varphi_j, (S_0 - S)(\varphi - \varphi_j)\bigr)_{L^2(D)}$ by the Cauchy-Schwarz inequality. Therefore, the first three terms on the right-hand side of (7.78) converge to zero, and the last term $(\varphi_j, S\varphi)_{L^2(D)}$ converges to $(\varphi, S\varphi)_{L^2(D)}$. Taking the imaginary part and noting that $\bigl(\varphi - \varphi_j, S_0(\varphi - \varphi_j)\bigr)_{L^2(D)}$ is real-valued yields $\mathrm{Im}\,(\varphi, S\varphi)_{L^2(D)} = 0$ and thus $\varphi = 0$ by part (d) of Theorem 7.32. Now we write, using the coercivity of $S_0$ by part (a) of Theorem 7.32,

\[ \frac{1}{\|m\|_\infty} \le (\varphi_j, S_0\varphi_j)_{L^2(D)} \le \bigl|(\varphi_j, (S_0 - S)\varphi_j)_{L^2(D)}\bigr| + \bigl|(\varphi_j, S\varphi_j)_{L^2(D)}\bigr|, \]

and the right-hand side tends to zero which is certainly a contradiction.


Coercivity of the middle operator $R : L^2(S^2) \to L^2(S^2)$ in the second factorization of (7.74) can be proven by using the fact that the scattering operator is unitary. Before doing this we prove a result of independent interest.


Lemma 7.36 Let $k^2$ be no interior transmission eigenvalue in the sense of Definition 7.21 and let $\lambda_j \in \mathbb{C}$, $j \in \mathbb{N}$, be the eigenvalues of the normal far field operator $F$. Then $\lambda_j$ lie on the circle $|2\pi i/k - z| = 2\pi/k$ with center $2\pi i/k$ and radius $2\pi/k$ passing through the origin and converging to zero from the right; that is, $\mathrm{Re}\,\lambda_j > 0$ for sufficiently large $j$.


Proof: The fact that A, lie on the circle with center 27i/k passing through the 
origin follows from the unitarity of the scattering operator S = I + (ik)/(2a) F 
(see Theorem 7.20). We have only to show that the eigenvalues tend to zero 
from the right. Let ~; again be the normalized and orthogonal eigenfunctions 
of F' corresponding to the nonvanishing eigenvalues A;. From the factorization 
(7.63) it follows that 


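\[ 4\pi k^2\,(S^*G^*\psi_j,\,G^*\psi_\ell)_{L^2(D)} \;=\; (F\psi_j,\psi_\ell)_{L^2(S^2)} \;=\; \lambda_j\,\delta_{j,\ell} \]

with the Kronecker symbol $\delta_{j,\ell} = 1$ for $j = \ell$ and $0$ otherwise. We set

\[ \varphi_j \;=\; \frac{2k\sqrt{\pi}}{\sqrt{|\lambda_j|}}\;G^*\psi_j \qquad\text{and}\qquad s_j \;=\; \frac{\lambda_j}{|\lambda_j|}\,. \]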
Then $(\varphi_j, S\varphi_\ell)_{L^2(D)} = s_j\,\delta_{j,\ell}$. From the facts that $\lambda_j$ lie on the circle with center
$2\pi i/k$ passing through the origin and that $\lambda_j$ tends to zero as $j$ tends to infinity
we conclude that the only accumulation points of the sequence $(s_j)$ can be $+1$
or $-1$. The assertion of the lemma is proven once we have shown that $+1$ is
the only possible accumulation point. Assume, on the contrary, that $s_j \to -1$
for a subsequence. From Lemma 7.35 we observe that the sequence $(\varphi_j)$ is
bounded in $L^2(D)$. Therefore, there exists a weakly convergent subsequence
(see Corollary A.78 of Appendix A.9) that we denote by $\varphi_j \rightharpoonup \varphi$. Now we write
exactly as in equation (7.78):

\[
\begin{aligned}
\bigl(\varphi-\varphi_j,\,S_0(\varphi-\varphi_j)\bigr)_{L^2(D)}
 &= \bigl(\varphi,\,S_0(\varphi-\varphi_j)\bigr)_{L^2(D)} - \bigl(\varphi_j,\,(S_0-S)(\varphi-\varphi_j)\bigr)_{L^2(D)} \\
 &\quad + \underbrace{(\varphi_j,S\varphi_j)_{L^2(D)}}_{=\,s_j} - (\varphi_j,S\varphi)_{L^2(D)} .
\end{aligned}
\]

The left-hand side is real-valued, the right-hand side tends to $-1-(\varphi,S\varphi)_{L^2(D)}$.
Taking the imaginary part again shows that $\varphi$ has to vanish, thus as before

\[ 0 \;\le\; (\varphi_j,S_0\varphi_j)_{L^2(D)} \;=\; \bigl(\varphi_j,(S_0-S)\varphi_j\bigr)_{L^2(D)} + (\varphi_j,S\varphi_j)_{L^2(D)} . \]

The right-hand side converges to $-1$ which is impossible and ends the proof.
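
The first step of the proof, that unitarity of $S = I + \frac{ik}{2\pi}F$ forces the eigenvalues of $F$ onto the circle with center $2\pi i/k$ and radius $2\pi/k$, can be checked numerically. The following is only a small sketch: the function names and the sample values of $k$ and $\lambda$ are illustrative assumptions, not data from the text.

```python
import cmath
import math

def scattering_eigenvalue(lam, k):
    """Eigenvalue 1 + i*k*lam/(2*pi) of S = I + (ik)/(2 pi) F that belongs
    to an eigenvalue lam of F (hypothetical helper for this illustration)."""
    return 1 + 1j * k * lam / (2 * math.pi)

def on_circle(lam, k, tol=1e-12):
    """True if |2*pi*i/k - lam| equals 2*pi/k up to the tolerance tol."""
    center = 2j * math.pi / k
    return abs(abs(center - lam) - 2 * math.pi / k) < tol

k = 3.0  # assumed sample wave number
# Sample points on the circle with center 2*pi*i/k and radius 2*pi/k.
samples = [2j * math.pi / k + (2 * math.pi / k) * cmath.exp(1j * t)
           for t in (0.3, 1.7, 4.0, -1.2)]
moduli = [abs(scattering_eigenvalue(lam, k)) for lam in samples]
```

Every entry of `moduli` equals one up to rounding, while a point off the circle (for example `lam = 1.0`) gives a modulus different from one.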


Now we can easily prove coercivity of the operator R in (7.74). 


Lemma 7.37 Assume that $k$ is not an interior transmission eigenvalue. Then
there exists $c_2 > 0$ with

\[ \bigl|(R\psi,\psi)_{L^2(S^2)}\bigr| \;\ge\; c_2\,\|\psi\|^2_{L^2(S^2)} \quad\text{for all } \psi \in L^2(S^2). \tag{7.79} \]


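Proof: It is sufficient to prove (7.79) for $\psi \in L^2(S^2)$ of the form $\psi = \sum_j c_j\psi_j$
with $\|\psi\|^2_{L^2(S^2)} = \sum_j |c_j|^2 = 1$. With the abbreviation $s_j = \lambda_j/|\lambda_j|$ it is

\[ \bigl|(R\psi,\psi)_{L^2(S^2)}\bigr| \;=\; \Bigl|\Bigl(\sum_{j=1}^{\infty} s_j c_j\psi_j,\;\sum_{j=1}^{\infty} c_j\psi_j\Bigr)_{L^2(S^2)}\Bigr| \;=\; \Bigl|\sum_{j=1}^{\infty} s_j\,|c_j|^2\Bigr| . \]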


The complex number $\sum_{j=1}^{\infty} s_j|c_j|^2$ belongs to the closure of the convex hull
$C = \mathrm{conv}\{s_j : j \in \mathbb{N}\} \subset \mathbb{C}$ of the complex numbers $s_j$ (see (A.53) of Appendix A.8).
We conclude that

\[ \bigl|(R\psi,\psi)_{L^2(S^2)}\bigr| \;\ge\; \inf\{|z| : z \in C\} \]

for all $\psi \in L^2(S^2)$ with $\|\psi\|_{L^2(S^2)} = 1$. From the previous lemma we know that
the set $C$ is contained in the part of the upper half-disk that is above the line
$\ell = \{t\hat s + (1-t)\,1 : t \in \mathbb{R}\}$ passing through $\hat s$ and $1$. Here, $\hat s$ is the point in
$\{s_j : j \in \mathbb{N}\}$ with the smallest real part. (The reader should draw a picture.)
Therefore, the distance of the origin to this convex hull $C$ is positive; that is,
there exists $c_2$ with (7.79).


By the range identity of Corollary 7.34 the ranges of $G$ and $(F^*F)^{1/4}$
coincide. The combination of this result and Theorem 7.31 yields the main result
of this section.


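Theorem 7.38 Assume that $k^2$ is not an interior transmission eigenvalue. For
any $z \in \mathbb{R}^3$ again define $\phi_z \in L^2(S^2)$ by (7.67); that is,

\[ \phi_z(\hat x) := e^{-ik\,\hat x\cdot z}, \qquad \hat x \in S^2 . \]

Then
\[ z \in D \iff \phi_z \in \mathcal{R}\bigl((F^*F)^{1/4}\bigr). \tag{7.80} \]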


We want to rewrite this condition using Picard's Theorem A.58 of
Appendix A.6. Again let $\lambda_j \in \mathbb{C}$ be the eigenvalues of the normal operator
$F$ with corresponding normalized eigenfunctions $\psi_j \in L^2(S^2)$. Then we note
that $\bigl(\sqrt{|\lambda_j|};\,\psi_j,\psi_j\bigr)$ is a singular system of $(F^*F)^{1/4}$. Therefore, Picard's
Theorem A.58 converts the condition $\phi_z \in \mathcal{R}((F^*F)^{1/4})$ into a decay behavior of
the expansion coefficients.


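Theorem 7.39 Under the assumptions of the previous theorem a point $z \in \mathbb{R}^3$
belongs to $D$ if and only if the series

\[ \sum_{j=1}^{\infty} \frac{\bigl|(\phi_z,\psi_j)_{L^2(S^2)}\bigr|^2}{|\lambda_j|} \tag{7.81} \]

converges.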


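If we agree on the notation $1/\infty = 0$ and $\mathrm{sign}(t) = 1$ for $t > 0$ and $\mathrm{sign}(t) = 0$
for $t = 0$ then

\[ \chi(z) \;=\; \mathrm{sign}\Bigl[\,\sum_{j=1}^{\infty} \frac{\bigl|(\phi_z,\psi_j)_{L^2(S^2)}\bigr|^2}{|\lambda_j|}\Bigr]^{-1}, \qquad z \in \mathbb{R}^3, \tag{7.82} \]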


is just the characteristic function of D. Formula (7.82) provides a simple and 
fast technique to visualize the object D. One simply plots the inverse of the 
series (7.81). In practice, this is a finite sum instead of a series, but the value 
of the finite sum is much larger for points z outside than for points inside D. 
We refer to the original paper [157] and to [160] for some typical plots. 
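
To make the plotting recipe concrete, the following sketch evaluates the inverse of the truncated series (7.81) at a sampling point $z$. It is only an illustration under stated assumptions: the eigenpairs $(\lambda_j,\psi_j)$ are assumed to be already available from some discretization of $F$, the inner product on $S^2$ is replaced by a quadrature rule with hypothetical nodes `dirs` and weights `weights`, and the sign convention $\phi_z(\hat x) = e^{-ik\hat x\cdot z}$ is assumed for (7.67). The name `factorization_indicator` and the toy data are not from the text.

```python
import cmath
import math

def factorization_indicator(z, k, eigvals, eigfuns, dirs, weights, cutoff=1e-12):
    """Inverse of the truncated Picard series sum_j |(phi_z, psi_j)|^2 / |lambda_j|.

    eigfuns[j][i] is the value of psi_j at the quadrature node dirs[i];
    eigenvalues with modulus below cutoff are skipped (an ad hoc truncation)."""
    phi_z = [cmath.exp(-1j * k * sum(d_i * z_i for d_i, z_i in zip(d, z)))
             for d in dirs]
    total = 0.0
    for lam, psi in zip(eigvals, eigfuns):
        if abs(lam) <= cutoff:
            continue
        # Quadrature approximation of the L^2(S^2) inner product (phi_z, psi_j).
        inner = sum(w * p * complex(q).conjugate()
                    for w, p, q in zip(weights, phi_z, psi))
        total += abs(inner) ** 2 / abs(lam)
    return 1.0 / total if total > 0.0 else math.inf

# Toy data (purely synthetic): two nodes, two orthonormal "eigenfunctions".
dirs = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
weights = [2 * math.pi, 2 * math.pi]
c = 1 / math.sqrt(4 * math.pi)
eigfuns = [[c, c], [c, -c]]
eigvals = [2.0j, 0.5j]
value = factorization_indicator((0.0, 0.0, 0.0), 1.0, eigvals, eigfuns, dirs, weights)
```

In an actual computation one would evaluate this function on a grid of points $z$ and plot the result; large values then indicate points inside $D$, small values points outside.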

We conclude this section with some further remarks on the factorization
method. The characteristic function $\chi$ derived in the previous theorem depends
only on the operator $F$. Nothing else about the scattering medium has to
be known for plotting this function. In particular, it is not assumed that the
support $D$ of $n-1$ is connected; it can very well consist of several components.
Also, the function $\chi$ can be plotted in every case where the scattering operator
$S$ is unitary (and thus $F$ is normal). This is the case, for example, if the
medium is perfectly soft or hard; that is, if a Dirichlet or Neumann boundary
condition, respectively, on $\partial D$ is imposed. The theoretical justification of the
factorization method however (i.e., the proof that $\chi$ is indeed the characteristic
function of $D$), has to be given in every single case. For the Dirichlet and
Neumann boundary condition and also for the impedance boundary condition
$\partial u/\partial\nu + \lambda u = 0$ on $\partial D$ with a real-valued function $\lambda$ this can be shown (see
[160]). This implies, in particular, a general uniqueness result. It is not possible
that different "scattering supports" $D$ give rise to the same far field operator
$F$. There are, however, cases of unitary scattering operators for which the
factorization method has not yet been justified. For example, if $D$ consists of
two components $D_1$ and $D_2$ (separated from each other) and $n \ge 1 + c_0$ on
$D_1$, but $n \le 1 - c_0$ on $D_2$ it is not known whether the factorization method is
valid. The same open question arises for the case where the Dirichlet boundary
condition is imposed on $\partial D_1$, and the Neumann boundary condition on $\partial D_2$.
The main problem is the range identity; that is, the characterization of the
range of $G$ by the known operator $F$.

There also exist extensions of the factorization method for absorbing media. 
In these cases, the far field operator fails to be normal. Although some results 
exist on the existence of eigenvalues (see, e.g., [54]) the methods to construct 
the second factorization as in (7.74) fail. Instead, one considers factorizations 
of the self-adjoint operator $F_\# = |\mathrm{Re}\,F| + |\mathrm{Im}\,F|$ where $\mathrm{Re}\,F = (F + F^*)/2$
and $\mathrm{Im}\,F = (F - F^*)/(2i)$ are the self-adjoint parts of $F$, and $|A|$ of a
self-adjoint operator $A$ is defined by its spectral system. We refer to [160] for a
comprehensive study of these cases. 


7.6 The Interior Transmission Eigenvalue Problem


As we have just seen, it is an important assumption for the factorization method 
to work that the wavenumber k is not an eigenvalue of the interior transmission 
eigenvalue problem (7.44a), (7.44b). This is one of the motivations to study this 
eigenvalue problem in more detail. We mention the monographs [34] and [55] 
with chapters on transmission eigenvalues and also the special issue [37] of the 
journal Inverse Problems which indicates the importance of this topic. First, 
we recall the Definition 7.21 for the convenience of the reader. 

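The wave number $k$ is an interior transmission eigenvalue if there exists a
nontrivial pair $(v,w) \in L^2(D) \times L^2(D)$ of fields that satisfies

\[ \Delta v + k^2 v = 0 \ \text{ in } D, \qquad \Delta w + k^2 n\,w = 0 \ \text{ in } D, \tag{7.83a} \]

\[ v = w \ \text{ on } \partial D, \qquad \frac{\partial v}{\partial\nu} = \frac{\partial w}{\partial\nu} \ \text{ on } \partial D, \tag{7.83b} \]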



in the ultra-weak sense; that is,

\[ \int_D \bigl[\Delta\psi + k^2 n\,\psi\bigr]\,w\,dx \;-\; \int_D \bigl[\Delta\phi + k^2\phi\bigr]\,v\,dx \;=\; 0 \tag{7.84} \]

for all $\phi,\psi \in H^2(D)$ with $\phi - \psi \in H_0^2(D)$.

In the case when the index of refraction has a nonvanishing imaginary part
(i.e., the medium is absorbing), there exist no eigenvalues.


Theorem 7.40 If $\mathrm{Im}\,n(x) \ge 0$ on $D$ and $\mathrm{Im}\,n(x) > 0$ on some open set $A \subset D$
then the eigenvalue problem (7.44a), (7.44b) has no real eigenvalues $k > 0$.
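Proof: Let $(v,w) \in L^2(D) \times L^2(D)$ be a solution of (7.44a) and (7.44b)
corresponding to some eigenvalue $k > 0$. We choose any $\psi \in C^\infty(\mathbb{R}^3)$ with
compact support and set $\phi = \psi$ in (7.84). Then

\[ \int_D v\,\bigl[\Delta\psi + k^2\psi\bigr]\,dx \;=\; \int_D w\,\bigl[\Delta\psi + k^2 n\,\psi\bigr]\,dx ; \]

that is,

\[ \int_D (v-w)\,\bigl[\Delta\psi + k^2\psi\bigr]\,dx \;=\; k^2\int_D (n-1)\,w\,\psi\,dx . \]

Set

\[ u \;=\; \begin{cases} v - w & \text{in } D, \\ 0 & \text{in } \mathbb{R}^3\setminus D . \end{cases} \]

Then

\[ \int_{\mathbb{R}^3} u\,\bigl[\Delta\psi + k^2\psi\bigr]\,dx \;=\; k^2\int_D (n-1)\,w\,\psi\,dx \]

for all $\psi \in C^\infty(\mathbb{R}^3)$ with compact support. The regularity result of Lemma 7.10
implies $u \in H^2(\mathbb{R}^3)$ and

\[ \Delta u + k^2 u \;=\; k^2(n-1)\,w \quad\text{almost everywhere in } \mathbb{R}^3 \tag{7.85} \]

where we have set the right hand side to zero outside of $D$. In particular,
$u \in H_0^2(D)$ by Lemma 7.4. Setting $\psi = \overline{u}$ and $\phi = 0$ in (7.84) we conclude that

\[ \int_D w\,\bigl[\Delta\overline{u} + k^2 n\,\overline{u}\bigr]\,dx \;=\; 0 . \]

Multiplication of (7.85) by $\overline{u}$ and integration yields, using the previous formula,

\[
\begin{aligned}
\int_D \overline{u}\,\bigl[\Delta u + k^2 u\bigr]\,dx
 &= \int_D \bigl[k^2 n\,w\,\overline{u} - k^2 w\,\overline{u}\bigr]\,dx \\
 &= -\int_D w\,\bigl[\Delta\overline{u} + k^2\overline{u}\bigr]\,dx \\
 &= -k^2\int_D (\overline{n}-1)\,|w|^2\,dx .
\end{aligned}
\]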
Since $u \in H_0^2(D)$ we can apply Green's first formula (7.12a) which shows that
the left hand side is real valued. Now we take the imaginary part and arrive at

\[ \int_D \mathrm{Im}\,n\,|w|^2\,dx \;=\; 0 . \]

Because $\mathrm{Im}\,n(x) \ge 0$ for all $x$ and $\mathrm{Im}\,n(x) > 0$ on the open set $A$ we conclude
that $w$ has to vanish in $A$. The general unique continuation principle (see the
remark following Theorem 7.7) implies that $w$ vanishes in all of $D$; that is,
$\Delta u + k^2 u = 0$ in $\mathbb{R}^3$ and thus $u = 0$ in $D$ by the unique continuation principle.
Therefore, also $v$ vanishes in $D$, and $k$ cannot be an eigenvalue.


Therefore, in the following, we always assume that the refractive index $n$ is
real valued. We note that in the case where $D$ is not penetrable but acoustically
soft (i.e., $u = 0$ on $\partial D$) the corresponding eigenvalue problem (with respect to
the justification of the factorization method) is just the classical eigenvalue
problem for $-\Delta$ in $D$ with respect to the Dirichlet boundary condition $u = 0$
on $\partial D$. Compared to this classical case the interior transmission eigenvalue
problem is much less understood. Under certain assumptions on $n$ we will show
below in Subsection 7.6.2 that the spectrum is discrete and accumulates at most
at infinity (see [52]), if eigenvalues exist at all. It took almost 20 years for the
proof of existence of real eigenvalues (see [213, 35, 36, 33]). The reason for this
gap is partially because the interior transmission eigenvalue problem is not
self-adjoint. Looking back, it is surprising that it took so long to prove existence of
real eigenvalues because the proof is rather elementary, and we will present it
in Subsection 7.6.2.

The fact that the transmission eigenvalue problem fails to be self-adjoint 
raises the question whether or not complex eigenvalues exist. In the general 
case the answer is totally open. For the case of D being a ball and n being 
radially symmetric a huge amount of work related to the existence and location 
of complex eigenvalues has appeared in the past decade. We will answer this 
question only for the special case of constant n in the following subsection (see 
Theorem 7.48). 

We refer to [64] for a survey prior to 2007 and to [158, 134] for interior
transmission eigenvalue problems for other types of elliptic operators. Also
we refer again to the fourth edition of the monograph [55] where a chapter on
transmission eigenvalues has been added.

As just announced, we consider the special case of D being the unit ball and 
n being radially symmetric. 


7.6.1 The Radially Symmetric Case 


Let $D$ be the unit ball and let $n$ depend only on $r = |x|$. We assume that
$n \in C^2[0,1]$ is positive on $[0,1]$ and search for eigenfunctions $v$ and $w$ which are
also radially symmetric; that is, depend only on $r$. The Helmholtz equations for
$v$ and $w$ reduce to ordinary differential equations, see (7.8a). In Example 7.1
we proved the representation $v(r) = \alpha\sin(kr)/r$, $0 < r \le 1$, for some $\alpha \in \mathbb{R}$
and $w(r) = \tilde w(r)/r$ where $\tilde w(0) = 0$ and $\tilde w$ satisfies $\tilde w''(r) + k^2 n(r)\,\tilde w(r) = 0$ in
$(0,1)$. Therefore, $\tilde w$ is a multiple of the function $y_k$; that is, $w(r) = \beta\,y_k(r)/r$
for some $\beta \in \mathbb{C}$, where $y_k$ satisfies the initial value problem

\[ y_k''(r) + k^2 n(r)\,y_k(r) = 0, \ r \in (0,1), \qquad y_k(0) = 0, \quad y_k'(0) = 1 . \tag{7.86} \]


Then $w$ is regular at $0$ and solves the equation $\Delta w + k^2 n w = 0$ in $D$. Furthermore,
$(v,w)$ solves the eigenvalue problem (7.44a), (7.44b) if and only if $\alpha$ and
$\beta$ satisfy the system

\[
\begin{aligned}
\alpha\sin k - \beta\,y_k(1) &= 0, \\
\alpha\,k\cos k - \beta\,y_k'(1) &= 0 .
\end{aligned}
\]

This system has a nontrivial solution if its determinant vanishes; that is, for
those values of $k \in \mathbb{C}$ which are zeros of

\[ d(k) \;:=\; y_k'(1)\,\frac{\sin k}{k} \;-\; y_k(1)\cos k, \qquad k \in \mathbb{C}. \tag{7.87} \]


We call d the characteristic function of the radially symmetric transmission 
eigenvalue problem because it corresponds exactly to the characteristic function 
f = f(A) of the spectral problem studied in Chapter 5. 


In the following it will be important to study the asymptotic behavior of $y_k(1)$
and $y_k'(1)$ when $|k|$ tends to infinity. We first use the Liouville transformation
(see Section 5.1)

\[ s = s(r) = \int_0^r \sqrt{n(t)}\,dt, \qquad z_k(s) := n(r(s))^{1/4}\,y_k(r(s)) \tag{7.88a} \]

again. Here, $s \mapsto r(s)$ denotes the inverse function of the monotonic function
$r \mapsto s(r)$. It transforms the differential equation (7.86) into the following form
for $z_k$:

\[ z_k''(s) + \bigl(k^2 - q(s)\bigr)\,z_k(s) = 0 \quad\text{for } 0 < s < \eta, \tag{7.88b} \]

\[ q(s) \;:=\; \frac{n(r)^{-3/4}}{4}\,\frac{d}{dr}\Bigl(n(r)^{-5/4}\,n'(r)\Bigr)\Big|_{r=r(s)} \tag{7.88c} \]

and

\[ \eta = s(1) = \int_0^1 \sqrt{n(t)}\,dt . \tag{7.88d} \]


The initial conditions $y_k(0) = 0$ and $y_k'(0) = 1$ transform to $z_k(0) = 0$ and
$z_k'(0) = n(0)^{-1/4}$, respectively. The quantity $k^2$ plays the role of $\lambda$ of Section 5.1.
Then $z_k(s) = n(0)^{-1/4}\,u_2(s,k^2,q)$ where again $u_2 = u_2(s,k^2,q)$ denotes the
function of the fundamental system corresponding to (5.7b) with $u_2(0) = 0$ and
$u_2'(0) = 1$. With the asymptotic form of $u_2(\eta)$ and $u_2'(\eta)$ for $|k| \to \infty$ (see
Theorem 5.5), we have after the back transform $y_k(r) = n(r)^{-1/4}\,z_k(s(r)) =
[n(r)n(0)]^{-1/4}\,u_2(s(r),k^2,q)$

\[ y_k(1) \;=\; \frac{1}{[n(0)n(1)]^{1/4}}\,\frac{\sin(k\eta)}{k} \;+\; O\bigl(\exp(|\mathrm{Im}\,k|\,\eta)/|k|^2\bigr), \tag{7.89a} \]

\[ y_k'(1) \;=\; \Bigl[\frac{n(1)}{n(0)}\Bigr]^{1/4}\cos(k\eta) \;+\; O\bigl(\exp(|\mathrm{Im}\,k|\,\eta)/|k|\bigr), \tag{7.89b} \]


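where again $\eta = \int_0^1 \sqrt{n(s)}\,ds$. We substitute this into the definition (7.87) of
$d(k)$ which yields

\[ d(k) \;=\; \frac{1}{[n(0)n(1)]^{1/4}}\,\frac{f(k)}{k} \;+\; O\Bigl(\frac{\exp(|\mathrm{Im}\,k|(\eta+1))}{|k|^2}\Bigr) \tag{7.90} \]

as $|k| \to \infty$ where

\[
\begin{aligned}
f(k) &= \sqrt{n(1)}\,\sin k\,\cos(k\eta) - \sin(k\eta)\cos k \\
     &= A\sin\bigl(k(1+\eta)\bigr) + B\sin\bigl(k(1-\eta)\bigr)
\end{aligned}
\]

with

\[ A = \tfrac12\bigl(\sqrt{n(1)} - 1\bigr) \quad\text{and}\quad B = \tfrac12\bigl(\sqrt{n(1)} + 1\bigr). \tag{7.91} \]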

We note that in the case $n(1) = 1$; that is, $A = 0$, the estimate (7.90) is useful
only for real values of $k$ because for $|\mathrm{Im}\,k| \to \infty$ the term $f(k)$ is absorbed in
$O\bigl(\exp(|\mathrm{Im}\,k|(\eta+1))/|k|^2\bigr)$.

If $\eta \ne 1$ we observe that the amplitude $B$ of the second term of $f(k)$ dominates
the first term (note that $|A| < B$). Therefore, $f$ changes its sign between two
consecutive extreme points of $k \mapsto \sin(k(1-\eta))$; that is, $f$ has a zero in every interval
$\bigl(\frac{(m-1/2)\pi}{|1-\eta|}, \frac{(m+1/2)\pi}{|1-\eta|}\bigr)$
for $m \in \mathbb{N}$ and therefore also $d$ for sufficiently large $m$. From this we conclude
that the determinant vanishes at infinitely many discrete real values of $k$. If
$\eta = 1$ and $n(1) \ne 1$ there is only the first term with $A \ne 0$, and we can argue
analogously with the extreme values of $k \mapsto \sin(k(1+\eta))$.

Therefore, we have shown: 


Theorem 7.41 Let $n \in C^2[0,1]$ be positive and $\int_0^1 \sqrt{n(t)}\,dt \ne 1$ or $n(1) \ne 1$.
Then there exists an infinite number of real eigenvalues of (7.83a) and (7.83b),
and these eigenvalues tend to infinity.
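
The characteristic function $d$ from (7.87) is easy to evaluate numerically, which gives a quick way to locate some of the real eigenvalues whose existence the theorem asserts. The sketch below is only an illustration under assumptions that are not part of the text: the initial value problem (7.86) is integrated with the classical Runge-Kutta scheme, zeros are bracketed by sign changes on a uniform grid and refined by bisection, and the function names, step counts, and the test profile $n \equiv 2$ are arbitrary choices.

```python
import math

def solve_yk(n, k, steps=400):
    """Integrate y'' + k^2 n(r) y = 0, y(0) = 0, y'(0) = 1 on [0, 1] with the
    classical fourth-order Runge-Kutta method; returns (y(1), y'(1))."""
    h = 1.0 / steps
    y, yp, r = 0.0, 1.0, 0.0

    def acc(r_, y_):
        return -k * k * n(r_) * y_

    for _ in range(steps):
        k1y, k1p = yp, acc(r, y)
        k2y, k2p = yp + 0.5 * h * k1p, acc(r + 0.5 * h, y + 0.5 * h * k1y)
        k3y, k3p = yp + 0.5 * h * k2p, acc(r + 0.5 * h, y + 0.5 * h * k2y)
        k4y, k4p = yp + h * k3p, acc(r + h, y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        yp += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
        r += h
    return y, yp

def d(n, k):
    """Characteristic function d(k) = y_k'(1) sin(k)/k - y_k(1) cos(k)."""
    y1, yp1 = solve_yk(n, k)
    return yp1 * math.sin(k) / k - y1 * math.cos(k)

def real_zeros(n, k_min, k_max, samples=200, iters=40):
    """Real zeros of d in [k_min, k_max] detected by sign changes on a grid
    and refined by bisection (zeros of even order would be missed)."""
    ks = [k_min + (k_max - k_min) * i / samples for i in range(samples + 1)]
    vals = [d(n, kk) for kk in ks]
    zeros = []
    for a, b, fa, fb in zip(ks, ks[1:], vals, vals[1:]):
        if fa * fb < 0:
            for _ in range(iters):
                mid = 0.5 * (a + b)
                fm = d(n, mid)
                if fa * fm <= 0:
                    b, fb = mid, fm
                else:
                    a, fa = mid, fm
            zeros.append(0.5 * (a + b))
    return zeros

# Assumed test profile: constant refractive index n = 2 (so eta = sqrt(2)).
zeros_const = real_zeros(lambda r: 2.0, 1.0, 12.0)
```

For constant $n$ the result can be compared with the closed form of $d$, since then $y_k(r) = \sin(k\eta r)/(k\eta)$ with $\eta = \sqrt{n}$. For a variable profile one would pass any positive function `n(r)` instead.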


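We note that the corresponding eigenfunctions $v$ are Herglotz functions of
the form (7.48) because, by the following lemma, we have that

\[ \alpha\,\frac{\sin(kr)}{r} \;=\; \frac{\alpha k}{4\pi}\int_{S^2} e^{ik\,x\cdot\hat y}\,ds(\hat y), \qquad x \in \mathbb{R}^3 . \]

Lemma 7.42 For $x \in \mathbb{R}^3$ we have that

\[ \int_{S^2} e^{i\,x\cdot\hat y}\,ds(\hat y) \;=\; 4\pi\,\frac{\sin|x|}{|x|} . \]

Proof: Because the integrand is spherically symmetric, we can assume without
loss of generality that $x = \rho\hat x$ for some $\rho > 0$ and $\hat x$ is the "north pole",
that is, $\hat x = (0,0,1)^\top$. Then

\[
\int_{S^2} e^{i\rho\,\hat x\cdot\hat y}\,ds(\hat y)
 \;=\; \int_0^{2\pi}\!\!\int_0^{\pi} e^{i\rho\cos\theta}\,\sin\theta\,d\theta\,d\phi
 \;=\; 2\pi\int_{-1}^{1} e^{i\rho s}\,ds
 \;=\; 4\pi\,\frac{\sin\rho}{\rho} .
\]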

Before we consider the question of existence of complex eigenvalues k; that 
is, of complex zeros of the determinant d from (7.87), for constant values of n, 
we have to recall some results from the theory of entire functions. We follow 
the presentation in [55] (see also [56]) but simplify some of the proofs. 




Definition 7.43 Let $f : \mathbb{C} \to \mathbb{C}$ be an entire function; that is, holomorphic in
all of $\mathbb{C}$. The order $\rho$ of $f$ is defined as

\[ \rho \;=\; \limsup_{r\to\infty} \frac{\ln\bigl(\ln\|f\|_r\bigr)}{\ln r} \]

where $\|f\|_r = \max\{|f(z)| : |z| = r\}$ for $r > 0$.


Lemma 7.44 The characteristic function $d$ from (7.87) is an even entire function
of order one provided $n(1) \ne 1$ or $\eta := \int_0^1 \sqrt{n(s)}\,ds \ne 1$.


Proof: $d$ is an even function because $y_k$ and also $k \mapsto \sin k/k$ and $\cos$ are
all functions of $k^2$. Furthermore, from the Volterra integral equation (5.12b)
for the part $u_2(\cdot,\lambda,q)$ of the fundamental system $\{u_1,u_2\}$ of (5.7a), (5.7b) we
derive easily that $\lambda \mapsto u_2(\eta,\lambda,q)$ is holomorphic in all of $\mathbb{C}$ and thus also
$k \mapsto y_k(1) = n(1)^{-1/4}z_k(\eta) = [n(0)n(1)]^{-1/4}u_2(\eta,k^2,q)$. The same holds for
the derivative. Therefore, $d$ is an entire function.

It remains to compute the order of $d$. From (7.90) and the estimate $|\sin z| \le
e^{|z|}$ we observe that $|d(k)| \le c\,e^{|k|(\eta+1)}$ for some $c > 0$. Therefore, for $|k| = r$,

\[ \frac{\ln\bigl(\ln|d(k)|\bigr)}{\ln r} \;\le\; \frac{\ln\bigl[\ln c + (\eta+1)r\bigr]}{\ln r} \;\le\; \frac{\ln\bigl[(\eta+2)r\bigr]}{\ln r} \;=\; 1 + \frac{\ln(\eta+2)}{\ln r} \]

for large values of $r$ which proves that $\rho \le 1$. Now we set $k = it$ for $t \in \mathbb{R}_{>0}$.
If $n(1) \ne 1$ then it is easy to see that $|f(it)| \ge a\,e^{t(\eta+1)}$ for some $a > 0$ and large values of $t$
(where $f$ is given below (7.90)) and thus, again from (7.90), $|d(it)| \ge c\,e^{t(\eta+1)}/t$
for some $c > 0$. This yields that $\rho \ge 1$ and ends the proof if $n(1) \ne 1$. If
$n(1) = 1$ then $\eta \ne 1$ and $A = 0$ and one gets $|d(it)| \ge c\,e^{t|1-\eta|}$ and again $\rho \ge 1$.


The following theorem is a special case of the more general factorization 
theorem of Hadamard which we cite without proof (see, e.g., [19], Chapter 2). 
Theorem 7.45 (Hadamard)

Let $f$ be an entire function of order one, let $m \ge 0$ be the order⁷ of the root
$z = 0$ of $f$ (in particular, $m = 0$ if $f(0) \ne 0$), and let $\{a_j : j \in \mathbb{N}\}$ be the nonzero
roots of $f$ repeated according to multiplicity. Then there exists a polynomial $p$
of degree at most one such that

\[ f(z) \;=\; z^m\,e^{p(z)}\prod_{j=1}^{\infty}\Bigl(1 - \frac{z}{a_j}\Bigr)e^{z/a_j}, \qquad z \in \mathbb{C}. \tag{7.92} \]

With this theorem we can prove a slight generalization of a theorem of Laguerre. 


Theorem 7.46 Let $f$ be an entire function of order one which is real for real
values of $z$. Suppose that $f$ has infinitely many real zeros and only a finite
number of complex ones. Then $f$ has a single critical point⁸ on each interval
formed by two consecutive real zeros of $f$ provided this interval is sufficiently
far away from the origin.


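Proof: Since $f$ has only finitely many complex zeros (which will occur in conjugate
pairs) there exists a polynomial $q$ whose roots are exactly all those complex
ones of $f$ as well as a possible root at $z = 0$ which has the same order as the
possible zero of $f$ at $z = 0$. We factorize $f(z) = q(z)g(z)$ where all the roots of
$g$ are given by $\{a_j : j \in \mathbb{N}\}$ in nondecreasing magnitude (where multiple roots
are repeated according to their multiplicities). We apply Hadamard's theorem
to the function $g$ and have for real values of $x \notin \{a_j : j \in \mathbb{N}\}$

\[
\frac{g'(x)}{g(x)} \;=\; \frac{d}{dx}\ln g(x)
 \;=\; \frac{d}{dx}\Bigl\{p(x) + \sum_{j=1}^{\infty}\Bigl[\ln\Bigl(1-\frac{x}{a_j}\Bigr) + \frac{x}{a_j}\Bigr]\Bigr\}
 \;=\; \alpha + \sum_{j=1}^{\infty}\Bigl[\frac{1}{x-a_j} + \frac{1}{a_j}\Bigr]
\]

where $\alpha = p'(x)$ is a real constant (since $p$ is a polynomial of degree at most
one). Differentiating this expression yields

\[
\frac{d}{dx}\Bigl(\frac{f'(x)}{f(x)}\Bigr)
 \;=\; \frac{d}{dx}\Bigl(\frac{q'(x)}{q(x)}\Bigr) + \frac{d}{dx}\Bigl(\frac{g'(x)}{g(x)}\Bigr)
 \;=\; \frac{d}{dx}\Bigl(\frac{q'(x)}{q(x)}\Bigr) - \sum_{j=1}^{\infty}\frac{1}{(x-a_j)^2} .
\]

Since $q$ is a polynomial with no real zeros (except possibly $z = 0$) there exists
$c > 0$ with $\bigl|\frac{d}{dx}\bigl(\frac{q'(x)}{q(x)}\bigr)\bigr| \le \frac{c}{|x|^2}$ for all $|x| \ge 1$. Choose $N \in \mathbb{N}$ with $N > 2c$ and
then $R > 0$ with $|a_j - x|^2 \le 2|x|^2$ for all $|x| \ge R$ and $j = 1,\dots,N$. Then

\[
\frac{d}{dx}\Bigl(\frac{f'(x)}{f(x)}\Bigr)
 \;\le\; \frac{c}{|x|^2} - \sum_{j=1}^{N}\frac{1}{(x-a_j)^2}
 \;\le\; \frac{c}{|x|^2} - \frac{N}{2|x|^2}
 \;=\; \frac{2c-N}{2|x|^2} \;<\; 0
\]

for $|x| \ge R$, $x \notin \{a_j : j \in \mathbb{N}\}$. Therefore, $f'/f$ is strictly decreasing in every
interval $(a_\ell, a_{\ell+1})$ with $a_\ell < a_{\ell+1}$ and $\ell$ large enough. Furthermore, from

\[ \frac{f'(x)}{f(x)} \;=\; \frac{q'(x)}{q(x)} + \alpha + \sum_{j=1}^{\infty}\Bigl[\frac{1}{x-a_j} + \frac{1}{a_j}\Bigr] \tag{7.93} \]

we conclude that $\lim_{x\to a_\ell+} \frac{f'(x)}{f(x)} = +\infty$ and $\lim_{x\to a_{\ell+1}-} \frac{f'(x)}{f(x)} = -\infty$. Therefore, $f'/f$
has exactly one zero in the interval $(a_\ell, a_{\ell+1})$ which ends the proof.

⁷ Note that the order of a root of a holomorphic function is always finite.
⁸ that is, a zero of $f'$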


As a corollary we obtain Laguerre’s theorem (see [19], Chapter 2). 


Corollary 7.47 (Laguerre) 


Let f be an entire function of order one which is real for real values of z and 
all of its zeros are real. Then all of the zeros of f’ are real as well and interlace 
those of f. 


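Proof: In this case we set $q(z) = z^m$ where $m$ is the order of the zero $z = 0$.
Then, by analytic extension, (7.93) holds also for complex $z$ instead of $x \in \mathbb{R}$.
If $z \in \mathbb{C}$ is a critical point then from (7.93)

\[ 0 \;=\; \frac{m}{z} + \alpha + \sum_{j=1}^{\infty}\Bigl[\frac{1}{z-a_j} + \frac{1}{a_j}\Bigr] . \]

Taking the imaginary part yields

\[ 0 \;=\; -\mathrm{Im}\,z\,\frac{m}{|z|^2} \;-\; \mathrm{Im}\,z\sum_{j=1}^{\infty}\frac{1}{|z-a_j|^2} \]

which implies that $z$ is real. Now we follow exactly the proof of the preceding
theorem.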


Now we are able to answer the question of existence of complex eigenvalues
for the special case of $n$ being constant. For constant $n$ the function $y_k$ from
(7.86) is given by $y_k(r) = \frac{\sin(k\eta r)}{k\eta}$ for $r \ge 0$ where $\eta = \sqrt{n}$. Therefore, $d(k) =
\frac{1}{k\eta}\,f_\eta(k)$ where

\[ f_\eta(k) \;=\; \eta\cos(k\eta)\sin k - \cos k\,\sin(k\eta), \qquad k \in \mathbb{C}. \tag{7.94} \]


Here we indicated the dependence on 7 > 0. For the author the following result 
is rather surprising. 


Theorem 7.48 If $\eta = \sqrt{n} \ne 1$ is an integer or the reciprocal of an integer then
no complex eigenvalues with eigenfunctions depending only on $r$ exist. Otherwise,
there exist infinitely many complex eigenvalues.


Proof: We consider the function $f_\eta$ from (7.94) and note first that $-\eta f_{1/\eta}(\eta k) =
f_\eta(k)$ for all $k \in \mathbb{C}$. Therefore, a complex zero of $f_\eta$ exists if, and only if, a
complex zero of $f_{1/\eta}$ exists. It is thus sufficient to study the case $\eta > 1$.

Let first $\eta > 1$ be an integer. Then $f_\eta(k) = 0$ if, and only if, $k$ is a critical
point of the entire function $g(k) = \frac{\sin(\eta k)}{\sin k}$ because $g'(k) = f_\eta(k)/\sin^2 k$. Since
all of the zeros of $g$ are real, by Corollary 7.47 also its critical points are real;
that is, all of the zeros of $f_\eta$ are real.

Let now $\eta > 1$ not be an integer. We will construct a sequence $I_\ell$ of intervals
which tend to infinity such that $f_\eta$ does not change its sign on $I_\ell$ and each $I_\ell$
contains two consecutive real critical points of $f_\eta$. By Theorem 7.46 this is only
possible if there exist infinitely many complex zeros of $f_\eta$.

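From

\[ f_\eta'(k) \;=\; (1-\eta^2)\,\sin(k\eta)\,\sin k \]

we observe that $\{\frac{j\pi}{\eta} : j \in \mathbb{N}\}$ and $\{j\pi : j \in \mathbb{N}\}$ are the critical points of
$f_\eta$. Choose a sequence $m_\ell \in \mathbb{N}$ converging to infinity such that $\eta\,m_\ell \notin \mathbb{N}$ for all
$\ell \in \mathbb{N}$. Fix $\ell \in \mathbb{N}$ in the following and set $m = m_\ell$ for abbreviation. The interval
$(m\eta - 1, m\eta)$ contains an integer $j$. Set $\varepsilon = m\eta - j \in (0,1)$. The two points
$\frac{j\pi}{\eta}$ and $m\pi$ are consecutive zeros of $f_\eta'$ because their distance is $m\pi - \frac{j\pi}{\eta} = \frac{\varepsilon\pi}{\eta} < \frac{\pi}{\eta}$.
Furthermore,

\[
\begin{aligned}
f_\eta\Bigl(\frac{j\pi}{\eta}\Bigr) &= \eta\cos(j\pi)\sin\frac{j\pi}{\eta} - \cos\frac{j\pi}{\eta}\,\sin(j\pi) \\
 &= \eta\,(-1)^j\sin\Bigl(m\pi - \frac{\varepsilon\pi}{\eta}\Bigr) \;=\; \eta\,(-1)^{j+1+m}\sin\frac{\varepsilon\pi}{\eta} \quad\text{and} \\
f_\eta(m\pi) &= \eta\cos(m\pi\eta)\sin(m\pi) - \cos(m\pi)\sin(m\pi\eta) \\
 &= (-1)^{m+1}\sin(j\pi + \varepsilon\pi) \;=\; (-1)^{m+1+j}\sin(\varepsilon\pi).
\end{aligned}
\]

We observe that the signs of $f_\eta\bigl(\frac{j\pi}{\eta}\bigr)$ and $f_\eta(m\pi)$ coincide because $\varepsilon, \varepsilon/\eta \in
(0,1)$. Furthermore, $f_\eta$ has no zero between $\frac{j\pi}{\eta}$ and $m\pi$ because otherwise
there would be another critical point between them. Therefore, the interval
$I_\ell = \bigl[\frac{j\pi}{\eta}, m\pi\bigr]$ has the desired properties.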


We refer to Problem 7.5 for two explicit examples where the assertions of 
this theorem are illustrated. 
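
The two elementary facts used in the proof, the symmetry $-\eta f_{1/\eta}(\eta k) = f_\eta(k)$ and the coinciding signs of $f_\eta$ at the consecutive critical points $j\pi/\eta$ and $m\pi$, can be checked numerically. The sketch below does this for the sample value $\eta = 1.5$; the helper names and the sample values are assumptions made for illustration only.

```python
import math

def f_eta(eta, k):
    """f_eta(k) = eta cos(k eta) sin k - cos k sin(k eta), see (7.94)."""
    return eta * math.cos(eta * k) * math.sin(k) - math.cos(k) * math.sin(eta * k)

def critical_pair(eta, m):
    """For non-integer m*eta return the consecutive critical points
    j*pi/eta < m*pi with j = floor(m*eta), together with the values of f_eta."""
    j = math.floor(m * eta)  # the integer in the interval (m*eta - 1, m*eta)
    left, right = j * math.pi / eta, m * math.pi
    return left, right, f_eta(eta, left), f_eta(eta, right)

eta = 1.5  # assumed sample value; m*eta is not an integer for odd m
pairs = [critical_pair(eta, m) for m in (1, 3, 5)]
symmetry_defect = max(abs(-eta * f_eta(1 / eta, eta * k) - f_eta(eta, k))
                      for k in (0.3, 1.1, 2.7, 9.4))
```

For each tuple in `pairs` the two function values have the same sign, so $f_\eta$ has two consecutive critical points without an intermediate sign change, which is the configuration that Theorem 7.46 excludes when only finitely many complex zeros exist.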
There exist also results on the existence of complex transmission eigenvalues
for non-constant refractive indices under certain conditions. For example, as
shown in [55] (see also [58]), if either $1 < \sqrt{n(1)} < \eta$ or $\eta < \sqrt{n(1)} < 1$ (where
$\eta$ is again given by (7.88d)) there exist infinitely many real and infinitely many
complex eigenvalues. Also, all complex eigenvalues lie in a strip around the
real axis if $n(1) \ne 1$. For a more detailed investigation of the location for the
constant case we refer to [257] and to [55] and the references therein.


7.6.2 Discreteness And Existence in the General Case 


We continue now with the general case; that is, $D$ is a bounded Lipschitz domain
and $n$ is real-valued with $n(x) \ge 1 + q_0$ on $D$ for some $q_0 > 0$. For the definition
of a Lipschitz domain we refer again to [191] or [161]. For Lipschitz domains we
can give an alternative characterization of an interior transmission eigenvalue.
We allow $k \ne 0$ to be complex.


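Theorem 7.49 Let $D$ be a Lipschitz domain. $k \in \mathbb{C}\setminus\{0\}$ is an interior
transmission eigenvalue if and only if there exist $u \in H_0^2(D)$ and $v \in L^2(D)$, not
vanishing simultaneously, such that

\[ \Delta u + k^2 n\,u \;=\; k^2(n-1)\,v \quad\text{a.e. in } D \quad\text{and} \tag{7.95a} \]

$\Delta v + k^2 v = 0$ in $D$ in the ultra weak sense; that is,

\[ \int_D v\,\bigl[\Delta\psi + k^2\psi\bigr]\,dx \;=\; 0 \quad\text{for all } \psi \in H_0^2(D). \tag{7.95b} \]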


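Proof: Let first $u \in H_0^2(D)$ and $v \in L^2(D)$ with (7.95a), (7.95b). Set $w = v - u$
and let $\phi,\psi \in H^2(D)$ with $\phi - \psi \in H_0^2(D)$. Then

\[
\begin{aligned}
&\int_D \Bigl\{ v\,\bigl[\Delta\phi + k^2\phi\bigr] - w\,\bigl[\Delta\psi + k^2 n\,\psi\bigr] \Bigr\}\,dx \\
&\quad= \int_D v\,\bigl[\Delta(\phi-\psi) + k^2(\phi-\psi)\bigr]\,dx + \int_D \Bigl\{ k^2 v\,(1-n)\,\psi + u\,\bigl[\Delta\psi + k^2 n\,\psi\bigr] \Bigr\}\,dx .
\end{aligned}
\]

The first integral on the right hand side vanishes because $v$ is an ultra weak
solution of $\Delta v + k^2 v = 0$. For the second integral we use Green's second formula
(7.12b) which yields

\[ \int_D u\,\bigl[\Delta\psi + k^2 n\,\psi\bigr]\,dx \;=\; \int_D \psi\,\bigl[\Delta u + k^2 n\,u\bigr]\,dx \;=\; k^2\int_D \psi\,(n-1)\,v\,dx . \]

Therefore, the pair $(v,w) \in L^2(D) \times L^2(D)$ satisfies (7.84). This proves the
first direction.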
For the reverse direction let $v,w \in L^2(D)$ with (7.84). Define $u = v - w$ in
$D$ and $u = 0$ in $\mathbb{R}^3\setminus D$. Furthermore, set $\phi = \psi$ for some $\psi \in C^\infty(\mathbb{R}^3)$ with
compact support in (7.84). Then

\[ \int_{\mathbb{R}^3} u\,\bigl[\Delta\psi + k^2 n\,\psi\bigr]\,dx \;=\; k^2\int_D (n-1)\,v\,\psi\,dx ; \]

that is, $u$ is an ultra weak solution of $\Delta u + k^2 n u = k^2(n-1)v$ in $\mathbb{R}^3$. The
regularity result of Lemma 7.10 yields $u \in H^2(\mathbb{R}^3)$ with $u = 0$ in $\mathbb{R}^3\setminus D$; that is,
$u \in H_0^2(D)$ by Lemma 7.4 and $\Delta u + k^2 n u = k^2(n-1)v$ almost everywhere in $D$.
Finally, let $\phi \in H_0^2(D)$ and set $\psi = 0$ in (7.84). Then $\int_D v\,[\Delta\phi + k^2\phi]\,dx = 0$.
This ends the proof.


This theorem makes it possible to eliminate the function $v$ from the system.
Indeed, let $u \in H_0^2(D)$ and $v \in L^2(D)$ satisfy (7.95a) and (7.95b). We divide
(7.95a) by $n-1$ (note that $n(x) - 1 \ge q_0$ on $D$ by assumption), multiply the
resulting equation by $\Delta\psi + k^2\psi$ for some $\psi \in H_0^2(D)$ and integrate. This gives

\[ \int_D \bigl[\Delta u + k^2 n\,u\bigr]\,\bigl[\Delta\psi + k^2\psi\bigr]\,\frac{dx}{n-1} \;=\; k^2\int_D v\,\bigl[\Delta\psi + k^2\psi\bigr]\,dx \;=\; 0 \]

by (7.95b); that is, replacing $\psi$ by its complex conjugate,

\[ \int_D \bigl[\Delta u + k^2 n\,u\bigr]\,\bigl[\Delta\overline{\psi} + k^2\overline{\psi}\bigr]\,\frac{dx}{n-1} \;=\; 0 \quad\text{for all } \psi \in H_0^2(D). \tag{7.96} \]

On the other side, if $u \in H_0^2(D)$ satisfies (7.96) then we set $v = \frac{1}{k^2(n-1)}[\Delta u +
k^2 n u] \in L^2(D)$ which satisfies (7.95b). In this sense the system (7.95a), (7.95b)
is equivalent to (7.96). Equation (7.96) is the weak form of the fourth order
equation $[\Delta + k^2]\,\frac{1}{n-1}\,[\Delta u + k^2 n u] = 0$.

With respect to (7.96) it is convenient to introduce a new inner product
$(\cdot,\cdot)_*$ in $H_0^2(D)$ by

\[ (u,\psi)_* \;:=\; \int_D \Delta u\,\Delta\overline{\psi}\,\frac{dx}{n-1}, \qquad u,\psi \in H_0^2(D). \tag{7.97} \]

Lemma 7.50 $(\cdot,\cdot)_*$ defines an inner product in $H_0^2(D)$ with corresponding norm
$\|\cdot\|_*$ which is equivalent to the ordinary norm $\|\cdot\|_{H^2(D)}$ in $H_0^2(D)$.


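Proof: By the definition of $H_0^2(D)$ and a denseness argument it is sufficient to
prove the existence of $c_1,c_2 > 0$ with $c_1\|\psi\|_* \le \|\psi\|_{H^2(D)} \le c_2\|\psi\|_*$ for all
$\psi \in C_0^\infty(D)$. The left estimate is obvious because $n - 1 \ge q_0 > 0$ on $D$. To
prove the right estimate we choose a cube $Q \subset \mathbb{R}^3$ which contains $D$ in its
interior, extend any $\psi \in C_0^\infty(D)$ by zero into $Q$, and then periodically (with
respect to $Q$) into $\mathbb{R}^3$. Without loss of generality we assume that $Q = (-\pi,\pi)^3$.
Then $\psi \in H^2_{per}(Q)$ and $\|\psi\|_{H^2(D)} \le c\,\|\psi\|_{H^2_{per}(Q)}$ where $c > 0$ is independent of
$\psi$. With the Fourier coefficients $\psi_j$ of $\psi$ (see (7.15b)) we have

\[
\begin{aligned}
\|\psi\|^2_{H^2_{per}(Q)} &= \sum_{j\in\mathbb{Z}^3} \bigl(1+|j|^2\bigr)^2\,|\psi_j|^2 \\
 &= |\psi_0|^2 + \sum_{0\ne j\in\mathbb{Z}^3} \bigl[1 + 2|j|^2 + |j|^4\bigr]\,|\psi_j|^2 \\
 &\le |\psi_0|^2 + 4\sum_{0\ne j\in\mathbb{Z}^3} |j|^4\,|\psi_j|^2 \;=\; |\psi_0|^2 + 4\gamma\,\|\Delta\psi\|^2_{L^2(Q)} \\
 &\le |\psi_0|^2 + 4\gamma\,\|n-1\|_\infty\,\|\psi\|_*^2 ,
\end{aligned}
\]

where $\gamma > 0$ denotes the normalization constant in Parseval's identity for the
Fourier coefficients (7.15b). It remains to estimate $|\psi_0|$. For $x = (\pi,0,0)^\top \in \partial Q$ we conclude from the
boundary condition that $\sum_j \psi_j\,e^{i j_1\pi} = 0$ and thus

\[ |\psi_0| \;\le\; \sum_{0\ne j\in\mathbb{Z}^3} |\psi_j| \;=\; \sum_{j\ne0} \frac{1}{|j|^2}\,|j|^2|\psi_j| \;\le\; \sqrt{\sum_{j\ne0}\frac{1}{|j|^4}}\;\sqrt{\sum_{j\ne0} |j|^4|\psi_j|^2} \;\le\; c\,\|\psi\|_* \]

as before.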
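Now we rewrite (7.96) in the form

\[ (u,\psi)_* + k^2 a(u,\psi) + k^4 b(u,\psi) \;=\; 0 \quad\text{for all } \psi \in H_0^2(D), \tag{7.98} \]

where

\[
\begin{aligned}
a(u,\psi) &= \int_D \bigl[n\,u\,\Delta\overline{\psi} + \overline{\psi}\,\Delta u\bigr]\,\frac{dx}{n-1} \\
 &= \int_D \bigl[u\,\Delta\overline{\psi} + \overline{\psi}\,\Delta u\bigr]\,\frac{dx}{n-1} + \int_D u\,\Delta\overline{\psi}\,dx , \\
b(u,\psi) &= \int_D n\,u\,\overline{\psi}\,\frac{dx}{n-1}
\end{aligned}
\]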

for $u,\psi \in H_0^2(D)$. The sesqui-linear forms $a$ and $b$ are Hermitian. For $a$ this
is seen from the second form and Green's second identity (7.12b). By the
representation theorem of Riesz (Theorem A.23 of the Appendix) in the Hilbert
space $H_0^2(D)$ for every $u \in H_0^2(D)$ there exists a unique $u' \in H_0^2(D)$ with
$a(u,\psi) = (u',\psi)_*$ for all $\psi \in H_0^2(D)$. We define the operator $A$ from $H_0^2(D)$
into itself by $Au = u'$. Therefore, $a(u,\psi) = (Au,\psi)_*$ for all $u,\psi \in H_0^2(D)$.
Then it is not difficult to prove that $A$ is linear, self-adjoint, and compact.⁹

⁹ For the proof of compactness one needs, however, the fact that $H_0^1(D)$ is compactly
imbedded in $L^2(D)$, see Problem 7.1.
Analogously, there exists a linear, self-adjoint, and compact operator $B$ from
$H_0^2(D)$ into itself with $b(u,\psi) = (Bu,\psi)_*$ for all $u,\psi \in H_0^2(D)$. Then (7.96) can
be written as $(u,\psi)_* + k^2(Au,\psi)_* + k^4(Bu,\psi)_* = 0$ for all $\psi \in H_0^2(D)$; that is,

\[ u + k^2 Au + k^4 Bu \;=\; 0 \quad\text{in } H_0^2(D). \tag{7.99} \]


This is a quadratic eigenvalue problem in the parameter $\tau = k^2$. We can reduce
it to a linear eigenvalue problem for a compact operator. Indeed, since $B$ is
positive (that is, $(Bu,u)_* > 0$ for all $u \ne 0$) there exists a positive and compact
operator $B^{1/2}$ from $H_0^2(D)$ into itself with $B^{1/2}B^{1/2} = B$ (see formula (A.47)
of the Appendix). If $u$ satisfies (7.99) for some $k \in \mathbb{C}\setminus\{0\}$ then set $u_1 = u$ and
$u_2 = k^2 B^{1/2}u$. Then the pair $\binom{u_1}{u_2} \in H_0^2(D) \times H_0^2(D)$ satisfies

\[ \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} + k^2 \begin{pmatrix} A & B^{1/2} \\ -B^{1/2} & 0 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} \;=\; \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{7.100} \]

Therefore $1/k^2$ is an eigenvalue of the compact (but not self-adjoint) operator
$\begin{pmatrix} -A & -B^{1/2} \\ B^{1/2} & 0 \end{pmatrix}$. Conversely, if $\binom{u_1}{u_2} \in H_0^2(D) \times H_0^2(D)$ satisfies (7.100)
then $u = u_1$ satisfies (7.99). Therefore, well-known results on the spectrum of
compact operators (see, e.g., [151]) imply the following theorem.


Theorem 7.51 There exists at most a countable number of eigenvalues k € C 
with no accumulation point in C. 


By different methods the discreteness can be shown under the weaker
assumption that $n > 1$ only in a neighborhood of the boundary $\partial D$ (see, e.g., [256, 159]
or [34]).
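
The reduction of the quadratic eigenvalue problem (7.99) to the linear problem (7.100) can be illustrated in the simplest possible setting, where the operators $A$ and $B$ are replaced by real numbers $a$ and $b > 0$. This scalar caricature is only an assumption made for illustration (the operators in the text act on $H_0^2(D)$), and the function names are hypothetical. The roots $\tau = k^2$ of $1 + a\tau + b\tau^2 = 0$ are then compared with the eigenvalues $\mu$ of the $2\times2$ matrix with rows $(a,\sqrt b)$ and $(-\sqrt b, 0)$, which must satisfy $\mu = -1/\tau$.

```python
import cmath

def quadratic_taus(a, b):
    """Roots tau of 1 + a*tau + b*tau^2 = 0, the scalar analogue of (7.99)
    with tau = k^2."""
    disc = cmath.sqrt(a * a - 4 * b)
    return [(-a + disc) / (2 * b), (-a - disc) / (2 * b)]

def linearized_eigenvalues(a, b):
    """Eigenvalues mu of the matrix [[a, sqrt(b)], [-sqrt(b), 0]], i.e. the
    roots of the characteristic polynomial mu^2 - a*mu + b."""
    disc = cmath.sqrt(a * a - 4 * b)
    return [(a + disc) / 2, (a - disc) / 2]

a, b = 1.0, 2.0  # assumed sample values with b > 0
taus = quadratic_taus(a, b)
mus = linearized_eigenvalues(a, b)
# Relation behind (7.100): (I + tau*M)x = 0 has a nontrivial solution
# exactly when -1/tau is an eigenvalue of M.
defects = [abs(-1 / tau - mu) for tau, mu in zip(taus, mus)]
```

With these sample values the roots are complex, which reflects the fact that the matrix operator is not self-adjoint, so nothing forces its eigenvalues to be real.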

The question of the existence of real eigenvalues was open for about 20 years.¹⁰
The idea of the proof of the following result goes back to Päivärinta and
Sylvester ([213]) and was generalized to general refractive indices by Cakoni,
Gintides, and Haddar ([36]).


Theorem 7.52 There exists a countable number of real eigenvalues k > 0 which 
converge to infinity. 


Proof: With the Riesz representations $A$ and $B$ of the Hermitian forms $a$ and $b$,
respectively, from above we define the family of operators $\Phi(\kappa) = I + \kappa A + \kappa^2 B$
for $\kappa \ge 0$ with corresponding sesqui-linear forms

\[
\begin{aligned}
\phi(\kappa;u,\psi) &= \int_D \bigl[\Delta u + \kappa n\,u\bigr]\,\bigl[\Delta\overline{\psi} + \kappa\overline{\psi}\bigr]\,\frac{dx}{n-1} \\
 &= \int_D \bigl[\Delta u + \kappa u\bigr]\,\bigl[\Delta\overline{\psi} + \kappa\overline{\psi}\bigr]\,\frac{dx}{n-1} + \kappa\int_D u\,\bigl[\Delta\overline{\psi} + \kappa\overline{\psi}\bigr]\,dx ;
\end{aligned}
\tag{7.101}
\]

that is, $\phi(\kappa;u,\psi) = (\Phi(\kappa)u,\psi)_*$ for all $u,\psi \in H_0^2(D)$. Here, $\kappa$ plays the role
of $k^2$ and is considered as a parameter. We search for parameters $\kappa > 0$ such
that $0$ is an eigenvalue of $\Phi(\kappa)$; that is, $-1$ is an eigenvalue of the compact and
self-adjoint operator $\kappa A + \kappa^2 B$.

¹⁰ Note that the matrix operator in (7.100) fails to be self-adjoint!

For $\kappa \ge 0$ let $\lambda_j(\kappa) \in \mathbb{R}$, $j \in \mathbb{N}$, be the nonzero eigenvalues of the compact
and self-adjoint operators $\kappa A + \kappa^2 B$. They converge to zero as $j$ tends to
infinity (if there exist infinitely many). The corresponding eigenspaces are
finite-dimensional. Let the negative eigenvalues be sorted as $\lambda_1^-(\kappa) \le \lambda_2^-(\kappa) \le \cdots < 0$
where the entries are repeated according to their multiplicity. In general, there could be
none or finitely many or infinitely many of them.

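Let $m \in \mathbb{N}$ be any natural number. First we construct $\hat\kappa > 0$ and a subspace
$V_m$ of $H_0^2(D)$ of dimension $m$ such that $\phi(\hat\kappa;u,u) \le 0$ for all $u \in V_m$: We choose
$\varepsilon > 0$ and $m$ pairwise disjoint balls $B_j = B(z_j,\varepsilon)$, $j = 1,\dots,m$, of radius $\varepsilon$
with $\bigcup_{j=1}^m B_j \subset D$. Setting $n_0 := 1 + q_0$ we note that $n(x) \ge n_0$ on $D$. In every
ball $B_j$ we consider the interior transmission eigenvalue problem with constant
refractive index $n_0$. By Theorem 7.41 infinitely many real and positive transmission
eigenvalues exist. Note that these eigenvalues do not depend on $j$ because
a translation of a domain results in the same interior transmission eigenvalues.
Let $\hat k > 0$ be the smallest one with corresponding eigenfunctions $u_j \in H_0^2(B_j)$.
Set $\hat\kappa = \hat k^2$ and let $\phi_j(\kappa; u,\psi)$ be the sesquilinear form corresponding to $n_0$ in
$B_j$. Then $\phi_j(\hat\kappa; u_j,\psi) = 0$ for all $\psi \in H_0^2(B_j)$. We extend each $u_j$ by zero to a
function in $H_0^2(D)$. (We use Lemma 7.4 again.) Then $\{u_j : j = 1,\dots,m\}$ are
certainly linearly independent because their supports are pairwise disjoint. We
define $V_m = \operatorname{span}\{u_j : j = 1,\dots,m\}$ as an $m$-dimensional subspace of $H_0^2(D)$
and compute for $u = \sum_{j=1}^m a_j u_j \in V_m$, using (7.101) and $n_0 - 1 = q_0 \le n - 1$,

$$
\begin{aligned}
\phi(\hat\kappa;u,u) &= \sum_{j=1}^m |a_j|^2\,\phi(\hat\kappa; u_j,u_j) \\
&= \sum_{j=1}^m |a_j|^2\Bigl[\int_{B_j} |\Delta u_j + \hat\kappa u_j|^2\,\frac{dx}{n-1} + \hat\kappa\int_{B_j} u_j\,[\Delta\bar u_j + \hat\kappa\bar u_j]\,dx\Bigr] \\
&\le \sum_{j=1}^m |a_j|^2\Bigl[\int_{B_j} |\Delta u_j + \hat\kappa u_j|^2\,\frac{dx}{q_0} + \hat\kappa\int_{B_j} u_j\,[\Delta\bar u_j + \hat\kappa\bar u_j]\,dx\Bigr] \\
&= \sum_{j=1}^m |a_j|^2\,\phi_j(\hat\kappa; u_j,u_j) \;=\; 0\,.
\end{aligned}
$$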


This shows that $(\Phi(\hat\kappa)u,u)_* \le 0$ for all $u \in V_m$. By Corollary A.55 we conclude
that there exist eigenvalues $\lambda_\ell(\hat\kappa) \le -1$ of $\hat\kappa A + \hat\kappa^2 B$ for $\ell = 1,\dots,m$. Furthermore,
again by Corollary A.55 of Appendix A.6 the eigenvalues $\lambda_\ell(\kappa)$ depend
continuously on $\kappa$ and $\lambda_\ell(0) = 0$ for all $\ell$. Therefore, for every $\ell \in \{1,\dots,m\}$
there exists $\kappa_\ell \in (0,\hat\kappa]$ with $\lambda_\ell(\kappa_\ell) = -1$. The corresponding eigenfunctions
$u_\ell$, $\ell = 1,\dots,m$, satisfy $\Phi(\kappa_\ell)u_\ell = 0$ which implies that $\{\sqrt{\kappa_\ell} : \ell = 1,\dots,m\}$
are interior transmission eigenvalues. Since $m$ was arbitrary the existence of
infinitely many real eigenvalues is shown.
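
The continuity argument at the end of the proof can be illustrated in finite dimensions. The following sketch is only a toy analogue: the 2x2 symmetric matrices `A` and `B` are hypothetical stand-ins for the operators $A$ and $B$, and the bisection merely mimics the intermediate value argument that produces a parameter $\kappa \in (0,\hat\kappa]$ at which the smallest eigenvalue of $\kappa A + \kappa^2 B$ equals $-1$.

```python
import math

def min_eigenvalue(m11, m12, m22):
    """Smallest eigenvalue of the real symmetric matrix [[m11, m12], [m12, m22]]."""
    trace = m11 + m22
    disc = math.sqrt((m11 - m22) ** 2 + 4.0 * m12 ** 2)
    return 0.5 * (trace - disc)

# Hypothetical symmetric matrices (entries m11, m12, m22); not taken from the text.
A = (-3.0, 1.0, 0.5)
B = (1.0, 0.2, 2.0)

def lam_min(kappa):
    """Smallest eigenvalue of kappa*A + kappa**2*B; continuous in kappa, zero at kappa = 0."""
    return min_eigenvalue(*(kappa * a + kappa ** 2 * b for a, b in zip(A, B)))

def find_kappa(kappa_hat, tol=1e-13):
    """Bisection for kappa in (0, kappa_hat] with lam_min(kappa) = -1."""
    assert lam_min(kappa_hat) <= -1.0, "need lam_min(kappa_hat) <= -1"
    lo, hi = 0.0, kappa_hat          # invariant: lam_min(lo) > -1 >= lam_min(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lam_min(mid) > -1.0:
            lo = mid
        else:
            hi = mid
    return hi

kappa_star = find_kappa(1.0)
print(kappa_star, lam_min(kappa_star))
```

In the proof the same reasoning is applied to each of the $m$ eigenvalue branches $\lambda_\ell(\kappa)$ separately.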


The—very natural—question of existence of complex transmission eigenval- 
ues in this general situation is totally open. Since even in the case of a constant 
refractive index in a ball both, existence and nonexistence, of complex eigen- 
values can occur (see, for example Theorem 7.48) the problem is certainly hard 
to solve. Aspects concerning the asymptotic distribution of the eigenvalues 
(Weyl’s formula), location and properties of the eigenfunctions have attracted 
a lot of attention, also for much more general types of elliptic problems. We 
only mention the special issue ([37]) of Inverse Problems in 2013, the already 
mentioned monographs [34, 55], and refer to the references therein. 


7.6.3 The Inverse Spectral Problem for the Radially Symmetric Case


After the study of the existence and discreteness of the interior transmission
eigenvalues the natural question arises whether these eigenvalues determine the
refractive index $n(x)$ uniquely. For spherically stratified media studied in
Subsection 7.6.1, this question is the analogue of the inverse Sturm-Liouville
problem of Chapter 5 and has been the subject of intensive research. It started with
the work by J. McLaughlin and P. Polyakov ([190]) and was picked up in, e.g.,
[56-58, 46, 45]. We will follow the presentations of [34, 55].

We assume as at the beginning of Subsection 7.6.1 that $n \in C^2[0,1]$ is positive
on $[0,1]$. This smoothness assumption is mainly necessary to apply the
Liouville transform. We have seen in Subsection 7.6.1 that the interior
transmission eigenvalues of (7.44a), (7.44b) are, for radially symmetric
eigenfunctions, just the zeros of the characteristic function $d$, given by (7.87);
that is,

$$ d(k) \;=\; y_k'(1)\,\frac{\sin k}{k} \;-\; y_k(1)\,\cos k\,, \quad k \in \mathbb{C}\,, \tag{7.102} $$

where $y_k$ solves the initial value problem

$$ y_k''(r) + k^2 n(r)\,y_k(r) = 0 \ \text{ in } (0,1)\,, \qquad y_k(0) = 0\,,\quad y_k'(0) = 1\,. \tag{7.103} $$

Therefore, the inverse problem is to recover the refractive index $n = n(r)$ from
the zeros of the characteristic function $d = d(k)$. This corresponds exactly to
the situation of Chapter 5 where the inverse spectral problem was to recover
$q = q(x)$ from the zeros of the characteristic function $f = f(\lambda)$. Therefore, it is
not surprising that we use similar arguments.

We saw already (in Lemma 7.44) that the characteristic function $d$ is an
even entire function of order one provided $n(1) \ne 1$ or $\eta := \int_0^1 \sqrt{n(s)}\,ds \ne 1$.
We prove a further property of $d$.


Lemma 7.53 Let $d$ be the characteristic function from (7.102). Then

$$ \lim_{k\to 0}\frac{d(k)}{k^2} \;=\; \frac{1}{3} - \int_0^1 n(s)\,s^2\,ds\,, $$

which is not zero if, for example, $n(r) \le 1$ for all $r \in [0,1]$ and $n \not\equiv 1$. Therefore,
under this condition $k = 0$ is a root of $d$ of order 2.

Proof: For low values of $|k|$ it is convenient to use an integral equation argument
directly for $y_k$. Indeed, it is not difficult to show that $y_k \in C^2[0,1]$ solves (7.103)
if, and only if, $y_k \in C[0,1]$ satisfies the Volterra equation

$$ y_k(r) \;=\; r \;-\; k^2\int_0^r (r-s)\,n(s)\,y_k(s)\,ds\,, \quad 0 \le r \le 1\,. \tag{7.104} $$

For sufficiently small $|k|$ (actually, for all values of $k$) this fixed point equation
is solved by the Neumann series (see Theorem A.31 of Appendix A). The first
two terms yield

$$ y_k(r) \;=\; r \;-\; k^2\int_0^r (r-s)\,n(s)\,s\,ds \;+\; O(|k|^4)\,. $$

Also, for the derivative we get

$$ y_k'(r) \;=\; 1 - k^2\int_0^r n(s)\,y_k(s)\,ds \;=\; 1 - k^2\int_0^r n(s)\,s\,ds \;+\; O(|k|^4)\,. $$

Substituting these expansions for $r = 1$ and the power series of $\sin k/k$ and $\cos k$
into the definition of $d$ yields after collecting the terms with $k^2$

$$ d(k) \;=\; k^2\Bigl[\frac{1}{3} - \int_0^1 n(s)\,s^2\,ds\Bigr] \;+\; O(|k|^4)\,. $$
0 


This ends the proof. 
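
The limit of Lemma 7.53 can be checked numerically in the simplest case. The sketch below assumes a constant index $n(r) = n_0$ (a hypothetical test value), for which the initial value problem $y_k'' + k^2 n_0 y_k = 0$, $y_k(0) = 0$, $y_k'(0) = 1$ has the explicit solution $y_k(r) = \sin(k\sqrt{n_0}\,r)/(k\sqrt{n_0})$, and it takes the characteristic function in the form $d(k) = y_k'(1)\sin k/k - y_k(1)\cos k$. The predicted limit is then $1/3 - n_0/3$.

```python
import math

def d_constant_index(k, n0):
    """d(k) = y_k'(1) * sin(k)/k - y_k(1) * cos(k) for the constant index n(r) = n0."""
    eta = math.sqrt(n0)
    y1 = math.sin(k * eta) / (k * eta)   # y_k(1)
    dy1 = math.cos(k * eta)              # y_k'(1)
    return dy1 * math.sin(k) / k - y1 * math.cos(k)

def predicted_limit(n0):
    """1/3 - int_0^1 n0 * s^2 ds = (1 - n0) / 3."""
    return (1.0 - n0) / 3.0

n0 = 0.25  # hypothetical test value with n0 < 1
for k in (1e-1, 1e-2, 1e-3):
    # the quotient d(k)/k^2 should approach the predicted limit as k -> 0
    print(k, d_constant_index(k, n0) / k ** 2, predicted_limit(n0))
```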


From the properties derived in Lemmas 7.44 and 7.53 we have the following
form of the characteristic function, provided $\int_0^1 n(s)\,s^2\,ds \ne 1/3$ holds and
$n(1) \ne 1$ or $\eta \ne 1$:

$$ d(k) \;=\; \gamma\,k^2\prod_{j=1}^{\infty}\Bigl(1 - \frac{k^2}{k_j^2}\Bigr)\,, \quad k \in \mathbb{C}\,, \tag{7.105} $$

for some $\gamma \in \mathbb{C}$ where $k_j \in \mathbb{C}$ are all nonzero roots of $d$ repeated according to
multiplicity.

This follows directly from Theorem 7.45 and the fact that with $k$ also $-k$ is a
zero. Indeed, let $\{k_j : j \in J\}$ be the set of zeros in the half plane
$\{z \in \mathbb{C} : \operatorname{Re} z > 0 \text{ or } z = it,\ t > 0\}$. Then the disjoint union
$\{k_j : j \in J\} \cup \{-k_j : j \in J\}$ covers all of the roots. We group the factors of
Hadamard's formula (7.5) into pairs
$\bigl(1 - \frac{k}{k_j}\bigr)e^{k/k_j}\bigl(1 + \frac{k}{k_j}\bigr)e^{-k/k_j} = 1 - \frac{k^2}{k_j^2}$ for $j \in J$.
Furthermore, the polynomial $p(k)$ must be constant because $d$ is even. This shows that

$$ d(k) \;=\; \gamma\,k^2\prod_{j\in J}\Bigl(1 - \frac{k^2}{k_j^2}\Bigr)\,. $$
After these preparations we turn to the inverse problem to determine
$n(r)$ from (all of) the roots $k_j \in \mathbb{C}$ of the characteristic function $d$. Our main
concern is again the question of uniqueness, namely, do the refractive indices $n_1$
and $n_2$ have to coincide if the roots of the corresponding characteristic functions
coincide?

Therefore, we assume that we have two positive refractive indices $n_1, n_2 \in C^2[0,1]$
such that the zeros $k_j$ of their characteristic functions $d_1$ and $d_2$ coincide.
Furthermore, we assume that $\int_0^1 n_\ell(s)\,s^2\,ds \ne 1/3$ for $\ell = 1,2$ and also $n_\ell(1) \ne 1$
for $\ell = 1,2$. Then the characteristic functions $d_\ell(k)$ have the representations
(7.105) with constants $\gamma_\ell$ for $\ell = 1,2$. Then we conclude that $d_1(k)/\gamma_1 = d_2(k)/\gamma_2$
for all $k \in \mathbb{C}$; that is, we can determine $d(k)/\gamma$ from the data $\{k_j : j \in \mathbb{N}\}$.


In the following we proceed in several steps. 


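Step A: We determine $\eta$ from the data; that is, we show that $\eta_1 = \eta_2$ where
$\eta_\ell = \int_0^1 \sqrt{n_\ell(s)}\,ds$ for $\ell = 1,2$ if the transmission eigenvalues corresponding to
$n_1$ and $n_2$ coincide. Fix $a \in \mathbb{R}$ and set $k = a + it$ for $t > 0$. From the asymptotic
behavior (7.90) we conclude that

$$
\begin{aligned}
\frac{k\,d(k)}{\gamma} &= \frac{1}{\gamma\,[n(0)n(1)]^{1/4}}\bigl[A\sin(k(1+\eta)) + B\sin(k(1-\eta))\bigr] + O\Bigl(\frac{\exp(t(\eta+1))}{|k|}\Bigr) \\
&= -\frac{A}{2i\,\gamma\,[n(0)n(1)]^{1/4}}\;e^{-ia(\eta+1)}\,e^{t(\eta+1)}\,\bigl[1 + O(1/t)\bigr]\,, \quad t \to \infty\,,
\end{aligned}
$$

because $1 + \eta > |1 - \eta|$. Here, $A$ and $B$ are given by (7.91) and $A \ne 0$ because
of $n(1) \ne 1$. Analogously, we have for the complex conjugate $\bar k = a - it$

$$ \frac{\bar k\,d(\bar k)}{\gamma} \;=\; \frac{A}{2i\,\gamma\,[n(0)n(1)]^{1/4}}\;e^{ia(\eta+1)}\,e^{t(\eta+1)}\,\bigl[1 + O(1/t)\bigr]\,, \quad t \to \infty\,. $$

Therefore, also the ratio is known and also

$$ \psi(a) \;:=\; \lim_{t\to\infty}\frac{k\,d(k)}{\bar k\,d(\bar k)} \;=\; -e^{-2ia(1+\eta)} \quad\text{for all } a \in \mathbb{R}\,. $$

This determines $\eta$ through $2i(1+\eta) = \psi'(0)$.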


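Step B: Under the assumption $\eta \ne 1$ we determine $n(1)$ and $\gamma\,n(0)^{1/4}$ from
the data. Let now $k > 0$ be real valued. Estimate (7.90) has the form

$$ \frac{k\,d(k)}{\gamma} \;=\; \frac{1}{\gamma\,[n(0)n(1)]^{1/4}}\bigl[A\sin(k(1+\eta)) + B\sin(k(1-\eta))\bigr] \;+\; R(k) $$

with $|R(k)| \le c_1/k$ for $k \ge 1$ and some $c_1 > 0$. The left hand side is known and
also $\eta$ from Step A. Therefore, for fixed $a > 0$, also

$$
\begin{aligned}
\psi_1(T) &:= \frac{1}{T}\int_a^T \frac{k\,d(k)}{\gamma}\,\sin(k(1+\eta))\,dk \\
&= \frac{1}{\gamma\,[n(0)n(1)]^{1/4}}\,\frac{1}{T}\int_a^T \bigl[A\sin^2(k(1+\eta)) + B\sin(k(1-\eta))\sin(k(1+\eta))\bigr]\,dk \\
&\quad + \frac{1}{T}\int_a^T R(k)\,\sin(k(1+\eta))\,dk
\end{aligned}
$$

and

$$
\begin{aligned}
\psi_2(T) &:= \frac{1}{T}\int_a^T \frac{k\,d(k)}{\gamma}\,\sin(k(1-\eta))\,dk \\
&= \frac{1}{\gamma\,[n(0)n(1)]^{1/4}}\,\frac{1}{T}\int_a^T \bigl[A\sin(k(1+\eta))\sin(k(1-\eta)) + B\sin^2(k(1-\eta))\bigr]\,dk \\
&\quad + \frac{1}{T}\int_a^T R(k)\,\sin(k(1-\eta))\,dk
\end{aligned}
$$

are known for $T \ge a$. The terms involving $R(k)$ tend to zero as $T$ tends
to infinity because $\frac{1}{T}\int_a^T |R(k)|\,dk \le \frac{c_1}{T}\int_a^T \frac{dk}{k} = \frac{c_1}{T}\ln(T/a)$. The other
elementary integrals can be computed explicitly which yields that
$\lim_{T\to\infty}\frac{1}{T}\int_a^T \sin(k(1+\eta))\sin(k(1-\eta))\,dk = 0$ and
$\lim_{T\to\infty}\frac{1}{T}\int_a^T \sin^2(k(1\pm\eta))\,dk = \frac{1}{2}$ (note that $\eta \ne 1$).
Therefore, the limits $\lim_{T\to\infty}\psi_1(T) = \frac{A}{2\gamma\,[n(0)n(1)]^{1/4}}$ and
$\lim_{T\to\infty}\psi_2(T) = \frac{B}{2\gamma\,[n(0)n(1)]^{1/4}}$ are known and thus also

$$ \lim_{T\to\infty}\frac{\psi_1(T)}{\psi_2(T)} \;=\; \frac{A}{B} \;=\; \frac{\sqrt{n(1)} - 1}{\sqrt{n(1)} + 1}\,. $$

This determines $n(1)$, thus also $A$ and $B$ and therefore also $\gamma\,n(0)^{1/4}$.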
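
The averaging in Step B can be tried out on the leading terms alone. The sketch below is an illustration under simplifying assumptions: the remainder $R(k)$ and the common prefactor $1/(\gamma[n(0)n(1)]^{1/4})$ are dropped, the coefficients are taken as $A = \sqrt{n(1)} - 1$ and $B = \sqrt{n(1)} + 1$ (only their ratio matters here), and the values of $\eta$ and $n(1)$ are hypothetical test data. The averages are evaluated with closed-form antiderivatives, and $n(1)$ is recovered by inverting $A/B = (\sqrt{n(1)}-1)/(\sqrt{n(1)}+1)$.

```python
import math

def antiderivative(p, q, k):
    """Antiderivative in k of sin(p*k) * sin(q*k) for p, q > 0."""
    if p == q:
        return 0.5 * (k - math.sin(2.0 * p * k) / (2.0 * p))
    return 0.5 * (math.sin((p - q) * k) / (p - q) - math.sin((p + q) * k) / (p + q))

def average(p, q, a, T):
    """(1/T) * integral over [a, T] of sin(p*k) * sin(q*k) dk."""
    return (antiderivative(p, q, T) - antiderivative(p, q, a)) / T

def recover_n1(eta, A, B, a=1.0, T=1.0e6):
    """Recover n(1) from the averaged leading terms of psi_1 and psi_2."""
    plus, minus = 1.0 + eta, 1.0 - eta
    psi1 = A * average(plus, plus, a, T) + B * average(minus, plus, a, T)
    psi2 = A * average(plus, minus, a, T) + B * average(minus, minus, a, T)
    ratio = psi1 / psi2                      # approaches A / B as T grows
    sqrt_n1 = (1.0 + ratio) / (1.0 - ratio)  # inverts ratio = (sqrt(n1)-1)/(sqrt(n1)+1)
    return sqrt_n1 ** 2

n1_true = 0.64                      # hypothetical value of n(1)
A = math.sqrt(n1_true) - 1.0        # assumed form, up to a common factor
B = math.sqrt(n1_true) + 1.0
n1_rec = recover_n1(0.6, A, B)      # eta = 0.6 is a hypothetical test value
print(n1_rec)
```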


From now on we assume that $n(r) \le 1$ for all $r \in [0,1]$ and $n(1) < 1$. Then
$\eta < 1$ and $\int_0^1 n(s)\,s^2\,ds < \frac{1}{3}$. Therefore, all of the assumptions are satisfied for
the determination of $\eta$, $n(1)$, and $\gamma\,n(0)^{1/4}$.


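Step C: Now we use the Liouville transform $y_k(r) = n(r)^{-1/4}\,z_k(s(r))$ from
(7.88a)–(7.88d) again where $s(r) = \int_0^r \sqrt{n(\sigma)}\,d\sigma$ and $z_k(s)$ solves (7.88b) in
$(0,\eta)$ with $z_k(0) = 0$ and $z_k'(0) = n(0)^{-1/4}$. The application of Theorem 5.19
in Example 5.20 yields an explicit form of $n(0)^{1/4}z_k(s)$ in terms of the solution
$K \in C^2(\overline{\Delta_0})$ of the Goursat problem¹¹

$$ K_{ss}(s,t) - K_{tt}(s,t) - q(s)\,K(s,t) = 0 \ \text{ in } \Delta_0\,, \tag{7.106a} $$
$$ K(s,0) = 0\,, \quad 0 \le s \le \eta\,, \tag{7.106b} $$
$$ K(s,s) = \frac{1}{2}\int_0^s q(\sigma)\,d\sigma\,, \quad 0 \le s \le \eta\,, \tag{7.106c} $$

where $q(s)$ is related to $n(r)$ by (7.88c) and $\Delta_0 = \{(s,t) \in \mathbb{R}^2 : 0 < t < s < \eta\}$.
Indeed, equation (5.52) yields

$$ y_k(r) \;=\; \frac{1}{[n(0)n(r)]^{1/4}}\Bigl[\frac{\sin(k s(r))}{k} + \int_0^{s(r)} K(s(r),t)\,\frac{\sin(kt)}{k}\,dt\Bigr] \tag{7.107} $$

for $0 \le r \le 1$ and thus

$$ y_k(1) \;=\; \frac{1}{[n(0)n(1)]^{1/4}}\Bigl[\frac{\sin(k\eta)}{k} + \int_0^{\eta} K(\eta,t)\,\frac{\sin(kt)}{k}\,dt\Bigr]\,. $$

Differentiation of (7.107) and setting $r = 1$ yields

$$
\begin{aligned}
y_k'(1) &= \frac{n(1)^{1/4}}{n(0)^{1/4}}\Bigl[\cos(k\eta) + \frac{\sin(k\eta)}{2k}\int_0^{\eta} q(s)\,ds + \int_0^{\eta}\frac{\partial K(\eta,t)}{\partial s}\,\frac{\sin(kt)}{k}\,dt\Bigr] \\
&\quad - \frac{n'(1)}{4\,n(0)^{1/4}\,n(1)^{5/4}}\Bigl[\frac{\sin(k\eta)}{k} + \int_0^{\eta} K(\eta,t)\,\frac{\sin(kt)}{k}\,dt\Bigr]\,.
\end{aligned}
$$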

Step D: We determine $K(\eta,t)$ for $t \in [0,\eta]$. With the constant $\gamma$ from
Hadamard's formula (7.105) we have for $k = \ell\pi$, $\ell \in \mathbb{N}$,

$$
\frac{\ell\pi\,d(\ell\pi)}{\gamma} \;=\; -\frac{\ell\pi\,y_{\ell\pi}(1)}{\gamma}\,(-1)^{\ell}
\;=\; \frac{(-1)^{\ell+1}}{\gamma\,[n(0)n(1)]^{1/4}}\Bigl[\sin(\ell\pi\eta) + \int_0^{\eta} K(\eta,t)\,\sin(\ell\pi t)\,dt\Bigr]\,.
$$

The left hand side and the factor in front of the bracket are known which
implies that $\int_0^{\eta} K(\eta,t)\,\sin(\ell\pi t)\,dt$ is determined from the data for all $\ell \in \mathbb{N}$.
This determines $K(\eta,\cdot)$ because $\{\sin(\ell\pi t) : \ell \in \mathbb{N}\}$ is complete in $L^2(0,\eta)$ since
$\eta < 1$. Indeed, extend $K(\eta,\cdot)$ by zero into $(0,1)$ and then to an odd function
into $(-1,1)$. Then $\int_0^{\eta} K(\eta,t)\,\sin(\ell\pi t)\,dt$ are the Fourier coefficients of this odd
extension which determine $K(\eta,\cdot)$ uniquely.

¹¹ The interval $[0,1]$ is now replaced by $[0,\eta]$ which does not affect the result.
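
The completeness argument of Step D can be made concrete with a small numerical experiment. In the sketch below the function `kernel_trace` is a hypothetical stand-in for $K(\eta,\cdot)$ (the true kernel is not computed here); it is extended by zero to $(\eta,1)$, its sine coefficients $c_\ell = 2\int_0^\eta K(\eta,t)\sin(\ell\pi t)\,dt$ are approximated by the midpoint rule, and the partial Fourier sine sum is compared with the original function.

```python
import math

def sine_coefficients(f, eta, num_modes, num_nodes=2000):
    """c_l = 2 * int_0^eta f(t) sin(l*pi*t) dt by the midpoint rule (f = 0 on (eta, 1))."""
    h = eta / num_nodes
    nodes = [(i + 0.5) * h for i in range(num_nodes)]
    values = [f(t) for t in nodes]
    return [2.0 * h * sum(v * math.sin(l * math.pi * t) for v, t in zip(values, nodes))
            for l in range(1, num_modes + 1)]

def reconstruct(coeffs, t):
    """Partial Fourier sine sum of the odd extension, evaluated at t in (0, 1)."""
    return sum(c * math.sin(l * math.pi * t) for l, c in enumerate(coeffs, start=1))

eta = 0.6                                   # hypothetical value with eta < 1

def kernel_trace(t):
    """Hypothetical stand-in for K(eta, t) on [0, eta]."""
    return t * (eta - t)

coeffs = sine_coefficients(kernel_trace, eta, num_modes=200)
print(reconstruct(coeffs, 0.3), kernel_trace(0.3))   # inside (0, eta)
print(reconstruct(coeffs, 0.8))                      # inside (eta, 1), where the extension is 0
```

The assumption $\eta < 1$ enters because the integrals over $(0,\eta)$ are Fourier coefficients on $(0,1)$ only after the extension by zero.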

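Step E: Determination of $n'(1)$. From the previous arguments $y_k(1)/\gamma$ is known
and thus also $\psi(k) := k\,\frac{d(k)}{\gamma} + k\,\frac{y_k(1)}{\gamma}\cos k = \frac{y_k'(1)}{\gamma}\sin k$. We determine the
asymptotic behavior of this expression as $k \in \mathbb{R}$ tends to infinity. The integrals
$\int_0^{\eta} K(\eta,t)\,\sin(kt)\,dt$ and $\int_0^{\eta} \frac{\partial K(\eta,t)}{\partial s}\sin(kt)\,dt$ tend to zero as $O(1/k)$ as seen
from partial integration. Therefore,

$$
\begin{aligned}
\psi(k) &= k\,\frac{d(k)}{\gamma} + k\,\frac{y_k(1)}{\gamma}\cos k \;=\; \frac{y_k'(1)}{\gamma}\sin k \\
&= \frac{n(1)^{1/4}}{\gamma\,n(0)^{1/4}}\Bigl[\cos(k\eta) + \frac{\sin(k\eta)}{2k}\int_0^{\eta} q(s)\,ds\Bigr]\sin k \\
&\quad - \frac{n'(1)}{4\gamma\,n(0)^{1/4}\,n(1)^{5/4}}\,\frac{\sin(k\eta)}{k}\,\sin k \;+\; O(1/k^2)\,.
\end{aligned}
$$

The left hand side is known and also the first term on the right hand side
because $\int_0^{\eta} q(s)\,ds = 2K(\eta,\eta)$. This determines $n'(1)$ from the data.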


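Step F: Determination of $\partial K(\eta,t)/\partial s$ for $t \in [0,\eta]$. We compute $\psi'(\ell\pi)$ from
the previously defined function $\psi$ as

$$
\begin{aligned}
\psi'(\ell\pi) &= \frac{y_{\ell\pi}'(1)}{\gamma}\cos(\ell\pi) \\
&= \frac{(-1)^{\ell}\,n(1)^{1/4}}{\gamma\,n(0)^{1/4}}\Bigl[\cos(\ell\pi\eta) + \int_0^{\eta}\frac{\partial K(\eta,t)}{\partial s}\,\frac{\sin(\ell\pi t)}{\ell\pi}\,dt \\
&\qquad - \frac{n'(1)}{4\,n(1)^{3/2}}\Bigl(\frac{\sin(\ell\pi\eta)}{\ell\pi} + \int_0^{\eta} K(\eta,t)\,\frac{\sin(\ell\pi t)}{\ell\pi}\,dt\Bigr) + \frac{\sin(\ell\pi\eta)}{\ell\pi}\,K(\eta,\eta)\Bigr]\,.
\end{aligned}
$$

From this we conclude that also $\int_0^{\eta} \frac{\partial K(\eta,t)}{\partial s}\sin(\ell\pi t)\,dt$ is known for all $\ell \in \mathbb{N}$
and thus also $\partial K(\eta,t)/\partial s$ by the same arguments as in Step D.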


Step G: Determination of $q = q(s)$ for $0 \le s \le \eta$. We recall from Steps D and
F that $K(\eta,t)$ and $\partial K(\eta,t)/\partial s$ are determined from the data for all $t \in [0,\eta]$.
More precisely, if $K_\ell$ denote the solutions of (7.106a)–(7.106c) for $q = q_\ell$, $\ell = 1,2$,
then $\eta_1 = \eta_2 =: \eta$ and $K_1(\eta,\cdot) = K_2(\eta,\cdot)$ and $\partial K_1(\eta,\cdot)/\partial s = \partial K_2(\eta,\cdot)/\partial s$.
The difference $K := K_1 - K_2$ satisfies

$$ K_{ss}(s,t) - K_{tt}(s,t) - q_1(s)\,K(s,t) \;=\; [q_1(s) - q_2(s)]\,K_2(s,t) \ \text{ in } \Delta_0\,, $$

and $K(\cdot,0) = 0$ on $[0,\eta]$ and $K(s,s) = \frac{1}{2}\int_0^s [q_1(\sigma) - q_2(\sigma)]\,d\sigma$ for $0 \le s \le \eta$.
Furthermore, $K(\eta,\cdot) = \partial K(\eta,\cdot)/\partial s = 0$ on $[0,\eta]$. Now we apply Theorem 5.18
from Chapter 5 with $q = q_1$, $F = K_2$, and $f = g = 0$. We observe that the pair
$(K, q_1 - q_2)$ solves the homogeneous system (5.43a)–(5.43d). The uniqueness
result of this theorem yields that $K = 0$ and $q_1 = q_2$. This proves Part G.


Step H: In this final part we have to determine $n$ from $q$ where their relationship
is given by (7.88c). Let again $n_1$ and $n_2$ be two indices with the same
transmission eigenvalues. Define $u_\ell$ by

$$ u_\ell(s) \;=\; [n_\ell(r_\ell(s))]^{1/4}\,, \quad s \in [0,\eta_\ell] = [0,\eta]\,, $$

where again $r_\ell = r_\ell(s)$ is the inverse of $s_\ell = s_\ell(r) = \int_0^r \sqrt{n_\ell(\sigma)}\,d\sigma$ for $\ell = 1,2$.
An elementary computation (using the chain rule and the derivative of
the inverse function and (7.88c)) yields that $u_\ell$ satisfies the ordinary linear
differential equation

$$ u_\ell''(s) \;=\; q_\ell(s)\,u_\ell(s)\,, \quad 0 \le s \le \eta\,, $$

with end conditions $u_\ell(\eta) = n_\ell(1)^{1/4}$ and $u_\ell'(\eta) = \frac{n_\ell'(1)}{4\,n_\ell(1)^{5/4}}$. From $q_1 = q_2$
and $n_1(1) = n_2(1)$ and $n_1'(1) = n_2'(1)$ and the uniqueness of this initial value
problem we conclude that $u_1(s) = u_2(s)$ for all $s$; that is,

$$ s_1'(r_1(s)) \;=\; \sqrt{n_1(r_1(s))} \;=\; \sqrt{n_2(r_2(s))} \;=\; s_2'(r_2(s))\,. $$

On the other hand, differentiating $s_\ell(r_\ell(s)) = s$ yields $s_\ell'(r_\ell(s))\,r_\ell'(s) = 1$ which
implies that $r_1' = r_2'$, thus $r_1 = r_2$ and, finally, $n_1 = n_2$.


We summarize the result in the following theorem. 


Theorem 7.54 Let $n_j \in C^2[0,1]$, $j = 1,2$, be positive with $n_j(r) \le 1$ on $[0,1]$
and $n_j(1) < 1$ such that all of the corresponding transmission eigenvalues with
radially symmetric eigenfunctions coincide. Then $n_1$ and $n_2$ have to coincide.


7.7 Numerical Methods 


In this section, we describe three types of numerical algorithms for the
approximate solution of the inverse scattering problem for the determination of $n$ and
not only of the support $D$ of $n - 1$. We assume, unless stated otherwise, that
$n \in L^\infty(\mathbb{R}^3)$ with $n(x) = 1$ outside some ball $B = B(0,a)$ of radius $a > 0$.

The numerical methods we describe now are all based on the Lippmann–Schwinger
integral equation. We define the volume potential $V\phi$ with density $\phi$ by

$$ (V\phi)(x) \;:=\; \int_{|y|<a} \frac{e^{ik|x-y|}}{4\pi|x-y|}\,\phi(y)\,dy\,, \quad x \in B\,. \tag{7.108} $$

Then the Lippmann–Schwinger equation (7.26) takes the form

$$ u - k^2 V(mu) \;=\; u^i \ \text{ in } B\,, \tag{7.109} $$
where again $m = n - 1$ and $u^i(x,\hat\theta) = \exp(ik\,\hat\theta\cdot x)$. The far field pattern of
$u^s = k^2 V(mu)$ is given by

$$ u_\infty(\hat x) \;=\; \frac{k^2}{4\pi}\int_{|y|<a} m(y)\,u(y)\,e^{-ik\,\hat x\cdot y}\,dy\,, \quad \hat x \in S^2\,. \tag{7.110} $$

Defining the integral operator $W : L^\infty(B) \to L^\infty(S^2)$ by

$$ (W\psi)(\hat x) \;:=\; \frac{k^2}{4\pi}\int_B \psi(y)\,e^{-ik\,\hat x\cdot y}\,dy\,, \quad \hat x \in S^2\,, \tag{7.111} $$

we note that the inverse scattering problem is to solve the system of equations

$$ u - k^2 V(mu) \;=\; u^i \ \text{ in } B\,, \tag{7.112a} $$
$$ W(mu) \;=\; u^\infty \ \text{ on } S^2\,, \tag{7.112b} $$

for $m$ and $u$. Here, $u^\infty$ denotes the measured far field pattern (in contrast to $u_\infty$
which is the true far field pattern). From the uniqueness results of Section 7.4,
we expect that the far field patterns of more than one incident field have to
be known. Therefore, from now on, we consider $u^i = u^i(x,\hat\theta) = \exp(ik\,\hat\theta\cdot x)$,
$u = u(x,\hat\theta)$, and $u^\infty = u^\infty(\hat x,\hat\theta)$ to be functions of two variables. The operators
$V$ and $W$ from equations (7.108) and (7.111) can be considered as linear and
bounded operators

$$ V : L^\infty(B \times S^2) \longrightarrow L^\infty(B \times S^2)\,, \tag{7.113a} $$
$$ W : L^\infty(B \times S^2) \longrightarrow L^\infty(S^2 \times S^2)\,. \tag{7.113b} $$


In the next sections, we discuss three methods for solving the inverse scattering 
problem, the first two of which are based on the system (7.112a), (7.112b). We 
formulate the algorithms and prove convergence results only for the setting in 
function spaces, although for the practical implementations these algorithms 
have to be discretized. The methods suggested by Gutman and Klibanov [112, 
113] and Kleinman and van den Berg [164] are iteration methods based on 
the system (7.112a), (7.112b). The first one is a regularized simplified Newton 
method, the second a modified gradient method. In Section 7.7.3, we describe 
a different method, which has been proposed by Colton and Monk in several 
papers (see [59]-[63]) and can be considered as an intermediate step towards 
the development of the linear sampling method (see [160]). 

The system (7.112a), (7.112b) describes a nonlinear equation for the pair
$(m,u)$. It can be shown that this equation is locally improperly posed in
the sense of Definition 4.1. In principle, all of the methods of Chapter 4 such as
Tikhonov's regularization or Landweber's method can be applied. Convergence
results are available once the assumptions such as the source condition or the
tangential cone condition can be verified. For the inverse scattering problem
discussed in this chapter T. Hohage and F. Weidling (see [138]) were able to
verify the variational source condition for an index function of logarithmic type.
To the author's knowledge, the validation of the tangential cone condition is
still open. Therefore, the results of Section 4.2 on the Tikhonov regularization
technique are applicable.


7.7.1 A Simplified Newton Method


For simplicity of the presentation we assume for this section that $n$ is continuous;
that is, $m \in C(\overline{D})$ for some bounded domain $D$ and $n = 1$ outside of $D$. By
scaling the problem, we assume throughout this section that $D$ is contained in
the cube $Q = [-\pi,\pi]^3 \subset \mathbb{R}^3$. We define the nonlinear mapping

$$ T : C(Q) \times C(Q \times S^2) \longrightarrow C(Q \times S^2) \times C(S^2 \times S^2) \tag{7.114a} $$

by

$$ T(m,u) \;:=\; \bigl(u - k^2 V(mu),\ W(mu)\bigr) \tag{7.114b} $$

for $m \in C(Q)$ and $u \in C(Q \times S^2)$. Then the inverse problem can be written in
the form

$$ T(m,u) \;=\; (u^i, u^\infty)\,. $$

The Newton method is to compute iterations $(m_\ell,u_\ell)$, $\ell = 0,1,2,\dots$ by

$$ (m_{\ell+1},u_{\ell+1}) \;=\; (m_\ell,u_\ell) - T'(m_\ell,u_\ell)^{-1}\bigl[T(m_\ell,u_\ell) - (u^i,u^\infty)\bigr] \tag{7.115} $$

for $\ell = 0,1,2,\dots$. The components of the mapping $T$ are bilinear, thus it is not
difficult to see that the Fréchet derivative $T'(m,u)$ of $T$ at $(m,u)$ is given by

$$ T'(m,u)(\mu,v) \;=\; \bigl(v - k^2 V(\mu u) - k^2 V(mv),\ W(\mu u) + W(mv)\bigr) \tag{7.116} $$

for $\mu \in C(Q)$ and $v \in C(Q \times S^2)$.

The simplified Newton method is to replace $T'(m_\ell,u_\ell)$ by some fixed $T'(\hat m,\hat u)$
(see Theorem A.65 of Appendix A). Then it is known that under certain
assumptions linear convergence can be expected. We choose $\hat m = 0$ and $\hat u = u^i$. Then
the simplified Newton method sets $m_{\ell+1} = m_\ell + \mu$ and $u_{\ell+1} = u_\ell + v$, where
$(\mu,v)$ solves $T'(0,u^i)(\mu,v) = (u^i,u^\infty) - T(m_\ell,u_\ell)$. Using the characterization
of $T'$, we are led to the following algorithm.


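(A) Set $m_0 = 0$, $u_0 = u^i$, and $\ell = 0$.
(B) Determine $(\mu,v) \in C(Q) \times C(Q \times S^2)$ from the system of equations

$$ v - k^2 V(\mu u^i) \;=\; u^i - u_\ell + k^2 V(m_\ell u_\ell)\,, \tag{7.117a} $$
$$ W(\mu u^i) \;=\; u^\infty - W(m_\ell u_\ell)\,. \tag{7.117b} $$

(C) Set $m_{\ell+1} = m_\ell + \mu$ and $u_{\ell+1} = u_\ell + v$, replace $\ell$ by $\ell + 1$, and continue
with step (B).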
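
The structure of steps (A)–(C) can be seen in a scalar toy model. In the following sketch all quantities are hypothetical: the operators $k^2V$ and $W$ are replaced by the numbers `k2V` and `W`, the fields by single numbers, and the sign convention $u - k^2V(mu) = u^i$ of (7.112a) is used. The derivative is frozen at $(0,u^i)$, so each sweep solves $W\mu u^i = u^\infty - W m_\ell u_\ell$ and $v - k^2V\mu u^i = u^i - u_\ell + k^2V m_\ell u_\ell$. It is meant only to show the update pattern and the linear convergence for small contrast, not the discretized method of [113].

```python
def simplified_newton(k2V, W, u_inc, far_field, num_iter=50):
    """Scalar toy version of steps (A)-(C) with the derivative frozen at (0, u_inc)."""
    m, u = 0.0, u_inc                                    # step (A)
    for _ in range(num_iter):
        # step (B): W * mu * u_inc = far_field - W * m * u
        mu = (far_field - W * m * u) / (W * u_inc)
        # step (B): v - k2V * mu * u_inc = u_inc - u + k2V * m * u
        v = u_inc - u + k2V * m * u + k2V * mu * u_inc
        m, u = m + mu, u + v                             # step (C)
    return m, u

# Hypothetical data generated from a known small contrast m_true.
k2V, W, u_inc = 0.5, 1.0, 1.0
m_true = 0.1
u_true = u_inc / (1.0 - k2V * m_true)    # solves u - k2V * m * u = u_inc
far_field = W * m_true * u_true

m_rec, u_rec = simplified_newton(k2V, W, u_inc, far_field)
print(m_rec, u_rec)
```

In this toy case the error in $m$ contracts by the factor $|1 - u|$ per sweep, which is small exactly when the contrast is small.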




We assume in the following that the given far field pattern $u^\infty$ is continuous
(that is, $u^\infty \in C(S^2 \times S^2)$). Solving an equation of the form $W(\mu u^i) = \rho$ means
solving the integral equation of the first kind,

$$ \int_Q \mu(y)\,e^{ik\,y\cdot(\hat\theta - \hat x)}\,dy \;=\; \frac{4\pi}{k^2}\,\rho(\hat x,\hat\theta)\,, \quad \hat x,\hat\theta \in S^2\,. \tag{7.118} $$

We approximately solve this equation by a special collocation method. We
observe that the left-hand side is essentially the Fourier transform of $\mu$
evaluated at $\xi = k(\hat x - \hat\theta)$. As in Gutman and Klibanov [113], we define $N \in \mathbb{N}$
to be the largest integer not exceeding $2k/\sqrt{3}$, the set

$$ Z_N \;=\; \{\,j \in \mathbb{Z}^3 : |j_s| \le N,\ s = 1,2,3\,\} $$

of grid points, and the finite-dimensional space

$$ X_N \;=\; \Bigl\{\,\sum_{j \in Z_N} a_j\,e^{i\,j\cdot x} : a_j \in \mathbb{C}\Bigr\}\,. \tag{7.119} $$

Then, for every $j \in Z_N$, there exist unit vectors $\hat x_j,\hat\theta_j \in S^2$ with $j = k(\hat x_j - \hat\theta_j)$
(note that $|j|/k \le 2$). This is easily seen from the fact that the intersection of
$S^2$ with the sphere of radius 1 and center $j/k$ is not empty. For every $j \in Z_N$,
we fix the unit vectors $\hat x_j$ and $\hat\theta_j$ such that $j = k(\hat x_j - \hat\theta_j)$.
We solve (7.118) approximately by substituting $\hat x_j$ and $\hat\theta_j$ into this equation.
This yields

$$ \int_Q \mu(y)\,e^{-i\,j\cdot y}\,dy \;=\; \frac{4\pi}{k^2}\,\rho(\hat x_j,\hat\theta_j)\,, \quad j \in Z_N\,. \tag{7.120} $$

The left-hand sides are just the first Fourier coefficients of $\mu$, therefore the
unique solution of (7.120) in $X_N$ is given by $\mu = L_1\rho$, where the operator
$L_1 : C(S^2 \times S^2) \to X_N$ is defined by

$$ (L_1\rho)(x) \;=\; \frac{1}{2\pi^2 k^2}\sum_{j \in Z_N} \rho(\hat x_j,\hat\theta_j)\,e^{i\,j\cdot x}\,. \tag{7.121} $$
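
The existence of the unit vectors $\hat x_j$, $\hat\theta_j$ with $j = k(\hat x_j - \hat\theta_j)$ can be made explicit. The sketch below shows one possible construction (an assumption for illustration, not necessarily the choice made in [113]): with $d = j/k$ and any unit vector $e$ orthogonal to $d$, set $\hat x_j = \sqrt{1 - |d|^2/4}\;e + d/2$ and $\hat\theta_j = \sqrt{1 - |d|^2/4}\;e - d/2$. The wave number `k = 3.0` is a hypothetical test value.

```python
import math

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]

def norm(a):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in a))

def unit_vector_pair(j, k):
    """Unit vectors (x_hat, theta_hat) with j = k * (x_hat - theta_hat), assuming |j| <= 2k."""
    d = [c / k for c in j]
    i_min = min(range(3), key=lambda i: abs(d[i]))
    axis = [1.0 if i == i_min else 0.0 for i in range(3)]
    e = cross(d, axis)                       # orthogonal to d; zero only if d = 0
    length = norm(e)
    e = [c / length for c in e] if length > 0.0 else axis
    height = math.sqrt(max(0.0, 1.0 - 0.25 * norm(d) ** 2))
    x_hat = [height * e[i] + 0.5 * d[i] for i in range(3)]
    theta_hat = [height * e[i] - 0.5 * d[i] for i in range(3)]
    return x_hat, theta_hat

k = 3.0                                      # hypothetical wave number
N = math.floor(2.0 * k / math.sqrt(3.0))     # largest integer not exceeding 2k/sqrt(3)
grid = [(j1, j2, j3) for j1 in range(-N, N + 1)
        for j2 in range(-N, N + 1) for j3 in range(-N, N + 1)]

max_error = 0.0
for j in grid:
    x_hat, theta_hat = unit_vector_pair(j, k)
    max_error = max(max_error, abs(norm(x_hat) - 1.0), abs(norm(theta_hat) - 1.0),
                    *(abs(k * (x_hat[i] - theta_hat[i]) - j[i]) for i in range(3)))
print(N, len(grid), max_error)
```

The bound $N \le 2k/\sqrt{3}$ guarantees $|j| \le \sqrt{3}\,N \le 2k$, so the square root in the construction is real for every $j \in Z_N$.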
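The regularized algorithm now takes the form

($A_r$) Set $m_0 = 0$, $u_0 = u^i$, and $\ell = 0$.
($B_r$) Set

$$ \mu \;:=\; L_1\bigl[u^\infty - W(m_\ell u_\ell)\bigr] \quad\text{and}\quad v \;:=\; u^i - u_\ell + k^2 V(m_\ell u_\ell) + k^2 V(\mu u^i)\,. $$

($C_r$) Set $m_{\ell+1} = m_\ell + \mu$ and $u_{\ell+1} = u_\ell + v$, replace $\ell$ by $\ell + 1$, and continue
with step ($B_r$).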
Then we can prove the following (see [113]).


Theorem 7.55 There exists $\varepsilon > 0$ such that, if $m \in C(Q)$ with $\|m\|_\infty \le \varepsilon$
and $u = u(x,\hat\theta)$ is the corresponding total field with exact far field pattern
$u^\infty(\hat x,\hat\theta) = u_\infty(\hat x,\hat\theta)$, then the sequence $(m_\ell,u_\ell)$ constructed by the regularized
algorithm ($A_r$), ($B_r$), ($C_r$) converges to some $(\tilde m,\tilde u) \in X_N \times C(Q \times S^2)$ that
satisfies the scattering problem with refraction contrast $\tilde m$. Its far field pattern
$\tilde u_\infty$ coincides with $u_\infty$ at the points $(\hat x_j,\hat\theta_j) \in S^2 \times S^2$, $j \in Z_N$. If, in addition,
the exact solution $m$ satisfies $m \in X_N$, then the sequence $(m_\ell,u_\ell)$ converges to
$(m,u)$.
Proof: We define the operator

$$ L : C(Q \times S^2) \times C(S^2 \times S^2) \longrightarrow X_N \times C(Q \times S^2) $$

by

$$ L(w,\rho) \;=\; \bigl(L_1\rho,\ w + k^2 V(u^i L_1\rho)\bigr)\,. $$

Then $L$ is a left inverse of $T'(0,u^i)$ on $X_N \times C(Q \times S^2)$; that is,

$$ L\,T'(0,u^i)(\mu,v) \;=\; (\mu,v) \quad\text{for all } (\mu,v) \in X_N \times C(Q \times S^2)\,. $$

Indeed, let $(\mu,v) \in X_N \times C(Q \times S^2)$ and set $(w,\rho) = T'(0,u^i)(\mu,v)$, i.e.,
$w = v - k^2 V(\mu u^i)$ and $\rho = W(\mu u^i)$. The latter equation implies that

$$ \int_Q \mu(y)\,e^{-i\,j\cdot y}\,dy \;=\; \frac{4\pi}{k^2}\,\rho(\hat x_j,\hat\theta_j)\,, \quad j \in Z_N\,. $$

Because $\mu \in X_N$, this yields $\mu = L_1\rho$ and thus $L(w,\rho) = (\mu,v)$.
With the abbreviations $z_\ell = (m_\ell,u_\ell)$ and $R = (u^i,u^\infty)$, we can write the
regularized algorithm in the form

$$ z_{\ell+1} \;=\; z_\ell - L\bigl[T(z_\ell) - R\bigr]\,, \quad \ell = 0,1,2,\dots $$

in the space $X_N \times C(Q \times S^2)$. We can now apply a general result about local
convergence of the simplified Newton method (see Appendix A, Theorem A.65).
This yields the existence of a unique solution $(\tilde m,\tilde u) \in X_N \times C(Q \times S^2)$ of
$L\bigl[T(\tilde m,\tilde u) - R\bigr] = 0$ and linear convergence of the sequence $(m_\ell,u_\ell)$ to $(\tilde m,\tilde u)$.
The equation $\tilde u - k^2 V(\tilde m\tilde u) = u^i$ is equivalent to the scattering problem by
Theorem 7.12. The equation $L_1 W(\tilde m\tilde u) = L_1 u^\infty$ is equivalent to
$\tilde u_\infty(\hat x_j,\hat\theta_j) = u^\infty(\hat x_j,\hat\theta_j)$ for all $j \in Z_N$. Finally, if $m \in X_N$, then $(m,u)$ satisfies
$L\,T(m,u) = L R$ and thus $(\tilde m,\tilde u) = (m,u)$. This proves the assertion.


We have formulated the algorithm with respect to the Lippmann–Schwinger
integral equation because our analysis of existence and continuous dependence
is based on this setting. There is an alternative way to formulate the simplified
Newton method in terms of the original scattering problems; see [113]. We
note also that our analysis can easily be modified to treat the case where only
$n \in L^\infty(B)$. For numerical examples, we refer to [113].
7.7.2 A Modified Gradient Method


The idea of the numerical method proposed and numerically tested by Kleinman
and van den Berg (see [164]) is to solve (7.112a), (7.112b) by a gradient-type
method. For simplicity, we describe the method again in the function space
setting and refer for discretization aspects to the original literature [164]. Again
let $B = B(0,a)$ contain the support of $m = n - 1$.


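(A) Choose $m_0 \in L^\infty(B)$, $u_0 \in L^2(B \times S^2)$, and set $\ell = 0$.
(B) Choose directions $e_\ell \in L^2(B \times S^2)$ and $d_\ell \in L^\infty(B)$, and set

$$ u_{\ell+1} = u_\ell + \alpha_\ell\,e_\ell\,, \qquad m_{\ell+1} = m_\ell + \beta_\ell\,d_\ell\,. \tag{7.122} $$

The stepsizes $\alpha_\ell,\beta_\ell > 0$ are chosen in such a way that they minimize the
functional

$$ \Psi_\ell(\alpha,\beta) \;:=\; \frac{\|r_{\ell+1}\|^2_{L^2(B\times S^2)}}{\|u^i\|^2_{L^2(B\times S^2)}} \;+\; \frac{\|s_{\ell+1}\|^2_{L^2(S^2\times S^2)}}{\|u^\infty\|^2_{L^2(S^2\times S^2)}}\,, \tag{7.123a} $$

where the defects $r_{\ell+1}$ and $s_{\ell+1}$ are defined by

$$ r_{\ell+1} \;:=\; u^i - u_{\ell+1} + k^2 V(m_{\ell+1}u_{\ell+1})\,, \tag{7.123b} $$
$$ s_{\ell+1} \;:=\; u^\infty - W(m_{\ell+1}u_{\ell+1})\,. \tag{7.123c} $$

(C) Replace $\ell$ by $\ell + 1$ and continue with step (B).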


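There are different choices for the directions $d_\ell$ and $e_\ell$. In [164],

$$ d_\ell(x) \;=\; -\int_{S^2} \tilde d_\ell(x,\hat\theta)\,\overline{u_\ell(x,\hat\theta)}\,ds(\hat\theta)\,, \quad x \in B\,, \quad\text{and}\quad e_\ell := r_\ell\,, \tag{7.124} $$

have been chosen where

$$ \tilde d_\ell \;=\; -W^*\bigl(W(m_\ell u_\ell) - u^\infty\bigr) \in L^\infty(B \times S^2)\,. $$

In this case, $d_\ell$ is the steepest descent direction of $\mu \mapsto \|W(\mu u_\ell) - u^\infty\|^2_{L^2(S^2\times S^2)}$.
In [266], for $d_\ell$ and $e_\ell$ Polak–Ribière conjugate gradient directions are chosen.
A rigorous convergence analysis of either method has not been carried out.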


A severe drawback of the methods discussed in Sections 7.7.1 and 7.7.2 is
that they iterate on functions $m_\ell = m_\ell(x)$ and $u_\ell = u_\ell(x,\hat\theta)$. To estimate the
storage requirements, we choose a grid of order $N \cdot N \cdot N$ grid points in $B$ and $M$
directions $\hat\theta_1,\dots,\hat\theta_M \in S^2$. Then both methods iterate on vectors of dimension
$N^3 \cdot M$. From the uniqueness results, $M$ is expected to be large, say, of order
$N^2$. For large values of $M$, the method described next has proven to be more
efficient.
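
To get a feeling for these numbers, the following short computation estimates the memory for a single iterate $u_\ell$. The grid size `N = 64` and the 16 bytes per value (one double-precision complex number) are illustrative assumptions, not figures from the cited papers.

```python
def iterate_storage_bytes(N, M, bytes_per_value=16):
    """Memory for one iterate u_l sampled at N^3 grid points and M incident directions."""
    return N ** 3 * M * bytes_per_value

N = 64                      # hypothetical grid size per coordinate direction
M = N ** 2                  # number of directions of order N^2
print(iterate_storage_bytes(N, M) / 2 ** 30, "GiB")   # prints "16.0 GiB"
```

With $M$ of order $N^2$ the storage grows like $N^5$, so doubling the resolution multiplies the memory by 32.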
7.7.3 The Dual Space Method


The method described here is due to Colton and Monk [61, 62] based on their
earlier work for inverse obstacle scattering problems (see [59, 60]). There exist
various modifications of this method, but we restrict ourselves to the simplest
case.

This method consists of two steps. In the first step, one tries to determine
a superposition of the incident fields $u^i = u^i(\cdot,\hat\theta)$ such that the corresponding
far field pattern $u_\infty(\cdot,\hat\theta)$ is (close to) the far field pattern of radiating multipoles.
In the second step, the function $m = n - 1$ is determined from an interior
transmission problem.

We describe both steps separately. Assume for the following that the origin
is contained in $B = B(0,a)$. By $u^\infty$ we denote again the measured far field
pattern.


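Step 1: Determine $g \in L^2(S^2)$ with

$$ \int_{S^2} u^\infty(\hat x,\hat\theta)\,g(\hat\theta)\,ds(\hat\theta) \;=\; 1\,, \quad \hat x \in S^2\,. \tag{7.125} $$

In Theorem 7.22, we have proven that for the exact far field pattern $u_\infty(\hat x,\hat\theta)$
this integral equation of the first kind is solvable in $L^2(S^2)$ if and only if the
interior transmission problem

$$ \Delta v + k^2 v = 0 \ \text{ in } B\,, \qquad \Delta w + k^2 n\,w = 0 \ \text{ in } B\,, \tag{7.126a} $$
$$ w(x) - v(x) \;=\; \frac{e^{ik|x|}}{|x|} \ \text{ on } \partial B\,, \tag{7.126b} $$
$$ \frac{\partial w(x)}{\partial\nu} - \frac{\partial v(x)}{\partial\nu} \;=\; \frac{\partial}{\partial\nu}\,\frac{e^{ik|x|}}{|x|} \ \text{ on } \partial B\,, \tag{7.126c} $$

has a solution $v, w \in L^2(B)$ in the ultra weak sense of Definition 7.21 such that

$$ v(x) \;=\; \int_{S^2} g(\hat\theta)\,e^{ik\,\hat\theta\cdot x}\,ds(\hat\theta)\,, \quad x \in \mathbb{R}^3\,. \tag{7.127} $$

The kernel of the integral operator in (7.125) is (for the exact far field pattern)
analytic with respect to both variables, thus (7.125) represents a severely
ill-posed (but linear) equation and can be treated by Tikhonov's regularization
method as described in Chapter 2 in detail. (In this connection, see the remark
following Theorem 7.23.)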

We formulate the interior transmission problem (7.126a)—(7.126c) as an inte- 
gral equation. 


Lemma 7.56 (a) Let $v, w \in L^2(B)$ solve the boundary value problem (7.126a)–(7.126c)
in the ultra weak sense of (7.46b). Define $u = w - v$ in $B$ and
$u(x) = \exp(ik|x|)/|x|$ in $\mathbb{R}^3 \setminus B$. Then $u \in H^2_{loc}(\mathbb{R}^3)$ and $u$ and $w \in L^2(B)$
solve

$$ u(x) \;=\; k^2\int_{|y|<a} m(y)\,w(y)\,\Phi(x,y)\,dy\,, \quad x \in \mathbb{R}^3\,, \tag{7.128a} $$

where again $m = n - 1$. Furthermore, $v$ satisfies $\Delta v + k^2 v = 0$ in $B$ in
the ultra weak sense; that is,

$$ \int_B (\Delta\phi + k^2\phi)\,v\,dx \;=\; 0 \quad\text{for all } \phi \in H_0^2(B)\,. \tag{7.128b} $$

(b) Let $u \in H^2_{loc}(\mathbb{R}^3)$ and $w \in L^2(B)$ solve (7.128a) and $u(x) = \exp(ik|x|)/|x|$
in $\mathbb{R}^3 \setminus B$. Furthermore, let $v := w - u \in L^2(B)$ be an ultra weak solution
of $\Delta v + k^2 v = 0$ in $B$ in the sense of (7.128b). Then $v, w \in L^2(B)$ solve
the boundary value problem (7.126a)–(7.126c) in the ultra weak sense of
(7.46b).

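Proof: (a) Let $v, w \in L^2(B)$ solve the boundary value problem (7.126a)–(7.126c)
in the ultra weak sense. Equation (7.128b) follows immediately from
(7.46b) by choosing $\phi \in H_0^2(B)$ and $\psi = 0$ in $B$.

Let now $\psi \in C^2(\mathbb{R}^3)$ with compact support and set $\phi = \psi$ in $\mathbb{R}^3$. Substituting
this into (7.46b) yields

$$
\begin{aligned}
\int_B (\Delta\psi + k^2\psi)\,(w - v)\,dx \;+\; k^2\int_B m\,\psi\,w\,dx
&= 4\pi\int_{\partial B}\Bigl[\Phi(\cdot,0)\,\frac{\partial\psi}{\partial\nu} - \psi\,\frac{\partial\Phi(\cdot,0)}{\partial\nu}\Bigr]\,ds \\
&= -4\pi\int_{\mathbb{R}^3\setminus B} \Phi(\cdot,0)\,(\Delta\psi + k^2\psi)\,dx
\end{aligned}
$$

for all $\psi \in C^2(\mathbb{R}^3)$ with compact support. Here we have used Green's theorem
in the exterior of $B$ (note that $\psi$ has compact support). Using the definition of
$u$ in $B$ and in the exterior of $B$ yields

$$ \int_{\mathbb{R}^3} (\Delta\psi + k^2\psi)\,u\,dx \;=\; -k^2\int_B m\,\psi\,w\,dx $$

for all $\psi \in C^2(\mathbb{R}^3)$ with compact support. The regularity result of Lemma 7.10
yields $u \in H^2_{loc}(\mathbb{R}^3)$ and $\Delta u + k^2 u = -k^2 m w$ in $\mathbb{R}^3$. This equation is uniquely
solved by the volume potential with density $k^2 m w$ (see Theorem 7.11). This
proves the first part.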

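(b) Since $u$ is the volume potential we conclude from Theorem 7.11 that $u$ is
the radiating solution of $\Delta u + k^2 u = -k^2 m w$ in $\mathbb{R}^3$. Let $\phi,\psi \in C^2(\mathbb{R}^3)$ with
compact support such that $\phi = \psi$ in the exterior of $B$. Then, since $w = v + u$
in $B$,

$$
\begin{aligned}
&\int_B (\Delta\psi + k^2 n\psi)\,w\,dx \;-\; \int_B (\Delta\phi + k^2\phi)\,v\,dx \\
&\quad= \int_B (\Delta\psi + k^2\psi)\,(v + u)\,dx \;+\; k^2\int_B m\,\psi\,w\,dx \;-\; \int_B (\Delta\phi + k^2\phi)\,v\,dx \\
&\quad= \int_B \bigl[\Delta(\psi - \phi) + k^2(\psi - \phi)\bigr]\,v\,dx \;+\; k^2\int_B m\,\psi\,w\,dx \\
&\qquad+ \int_{\mathbb{R}^3} (\Delta\psi + k^2\psi)\,u\,dx \;-\; \int_{\mathbb{R}^3\setminus B} (\Delta\psi + k^2\psi)\,u\,dx\,.
\end{aligned}
$$

The first integral on the right hand side vanishes because $v$ is an ultra weak
solution of $\Delta v + k^2 v = 0$ in $B$ and $\psi - \phi \in H_0^2(B)$. The sum of the second and
the third integral vanishes as well after application of Green's second formula
(7.12b) and $\Delta u + k^2 u = -k^2 m w$ in $\mathbb{R}^3$. For the last integral we use that
$u(x) = \exp(ik|x|)/|x|$ in $\mathbb{R}^3 \setminus B$ and apply Green's theorem in the exterior of $B$
which yields

$$ -\int_{\mathbb{R}^3\setminus B} (\Delta\psi + k^2\psi)\,u\,dx \;=\; \int_{\partial B}\Bigl[\frac{\partial\psi(x)}{\partial\nu}\,\frac{e^{ik|x|}}{|x|} - \psi(x)\,\frac{\partial}{\partial\nu}\,\frac{e^{ik|x|}}{|x|}\Bigr]\,ds\,, $$

which ends the proof.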


Motivated by this characterization, we describe the second step. 


Step 2: With the (approximate) solution $g \in L^2(S^2)$ of (7.125), define the
function $v = v_g$ by (7.127). Determine $m$ and $w$ such that $m$, $v_g$, and $w$
solve the interior boundary value problem (7.126a)–(7.126c) or, equivalently,
the system

$$ w - v_g - k^2 V(mw) \;=\; 0 \ \text{ in } B\,, \tag{7.129a} $$
$$ k^2 V(mw) - 4\pi\,\Phi(\cdot,0) \;=\; 0 \ \text{ on } \partial B\,, \tag{7.129b} $$

where $V$ again denotes the volume potential operator (7.108) and $\Phi$ the fundamental
solution (7.19). Here we used the trace theorem for $H^2$-functions and
the fact that $k^2 V(mw) = 4\pi\,\Phi(\cdot,0)$ on $\partial B$ is equivalent to $k^2 V(mw) = 4\pi\,\Phi(\cdot,0)$
in the exterior of $B$ by the uniqueness of the exterior Dirichlet problem.

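Instead of solving both steps separately, we can combine them and solve
the following optimization problem. Given a compact subset $C \subset L^\infty(B)$, some
$\varepsilon > 0$ and $\lambda_1,\lambda_2 > 0$,

$$ \text{minimize } J(g,w,m) \ \text{ on } L^2(S^2) \times L^2(B) \times C\,, \tag{7.130a} $$

where

$$
\begin{aligned}
J(g,w,m) \;:=\;& \|Fg - 1\|^2_{L^2(S^2)} \;+\; \varepsilon\,\|g\|^2_{L^2(S^2)} \\
&+ \lambda_1\,\|w - v_g - k^2 V(mw)\|^2_{L^2(B)} \\
&+ \lambda_2\,\|k^2 V(mw) - 4\pi\,\Phi(\cdot,0)\|^2_{L^2(\partial B)}\,,
\end{aligned}
\tag{7.130b}
$$

and the far field operator $F : L^2(S^2) \to L^2(S^2)$ is defined by (see (7.40))

$$ (Fg)(\hat x) \;:=\; \int_{S^2} u^\infty(\hat x,\hat\theta)\,g(\hat\theta)\,ds(\hat\theta)\,, \quad \hat x \in S^2\,. $$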

Theorem 7.57 This optimization problem (7.130a), (7.130b) has an optimal
solution $(g,w,m)$ for every choice of $\varepsilon,\lambda_1,\lambda_2 > 0$ and every compact subset
$C \subset L^\infty(B)$.


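Proof: Let $(g_j,w_j,m_j) \in L^2(S^2) \times L^2(B) \times C$ be a minimizing sequence; that
is, $J(g_j,w_j,m_j) \to J^*$ where the optimal value $J^*$ is defined by

$$ J^* \;:=\; \inf\{\,J(g,w,m) : (g,w,m) \in L^2(S^2) \times L^2(B) \times C\,\}\,. $$

We can assume that $(m_j)$ converges to some $m \in C$ because $C$ is compact.
Several tedious applications of the parallelogram equality

$$ \|a + b\|^2 \;=\; -\|a - b\|^2 + 2\|a\|^2 + 2\|b\|^2 $$

and the binomial formula

$$ \|b\|^2 \;=\; \|a\|^2 + 2\operatorname{Re}(a, b - a) + \|a - b\|^2 $$

yield

$$
\begin{aligned}
-J^* \;\ge\;& -J\Bigl(\tfrac{1}{2}(g_j + g_\ell),\ \tfrac{1}{2}(w_j + w_\ell),\ m_j\Bigr) \\
=\;& -\tfrac{1}{2}J(g_j,w_j,m_j) - \tfrac{1}{2}J(g_\ell,w_\ell,m_j) \\
&+ \tfrac{1}{4}\|F(g_j - g_\ell)\|^2_{L^2(S^2)} + \tfrac{\varepsilon}{4}\|g_j - g_\ell\|^2_{L^2(S^2)} \\
&+ \tfrac{\lambda_1}{4}\|(w_j - w_\ell) - v_{g_j - g_\ell} - k^2 V(m_j(w_j - w_\ell))\|^2_{L^2(B)} \\
&+ \tfrac{\lambda_2 k^4}{4}\|V(m_j(w_j - w_\ell))\|^2_{L^2(\partial B)}\,.
\end{aligned}
$$

From this we conclude that

$$
-J^* + \tfrac{1}{2}J(g_j,w_j,m_j) + \tfrac{1}{2}J(g_\ell,w_\ell,m_j)
\;\ge\; \tfrac{\varepsilon}{4}\|g_j - g_\ell\|^2_{L^2(S^2)} + \tfrac{\lambda_1}{4}\|(w_j - w_\ell) - v_{g_j - g_\ell} - k^2 V(m_j(w_j - w_\ell))\|^2_{L^2(B)}\,.
$$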
The left-hand side tends to zero as $j$ and $\ell$ tend to infinity, therefore we conclude
that $(g_j)$ is a Cauchy sequence, thus converging $g_j \to g$ in $L^2(S^2)$. Furthermore,
from

$$ \|(w_j - w_\ell) - v_{g_j - g_\ell} - k^2 V(m_j(w_j - w_\ell))\|_{L^2(B)} \;\longrightarrow\; 0 $$

as $\ell, j \to \infty$ and the convergence $g_j \to g$ we conclude that

$$ \|(w_j - w_\ell) - k^2 V(m_j(w_j - w_\ell))\|_{L^2(B)} \;\longrightarrow\; 0 $$

as $\ell, j \to \infty$. The operators $I - k^2 V(m_j\,\cdot)$ converge to the isomorphism
$I - k^2 V(m\,\cdot)$ in the operator norm of $L^2(B)$. Therefore, by Theorem A.37 of
Appendix A, we conclude that $(w_j)$ is a Cauchy sequence and thus is
convergent in $L^2(B)$ to some $w$. The continuity of $J$ implies that $J(g_j,w_j,m_j) \to J(g,w,m)$.
Therefore, $(g,w,m)$ is optimal.


7.8 Problems 


7.1 Let $Q = (-\pi,\pi)^3 \subset \mathbb{R}^3$ be the cube and $H^1(Q)$, $H^1_0(Q)$, and $H^1_{per}(Q)$ be
the Sobolev spaces defined at the beginning of this chapter. Show that
$H^1_{per}(Q) \subset H^1(Q)$ and $H^1_0(Q) \subset H^1_{per}(Q)$ with bounded inclusions. Use
this result to show that $H^1_0(Q)$ is compactly imbedded in $L^2(Q)$.


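7.2 Let $u^\infty_{b,1}(\hat x, \hat\theta, k)$ and $u^\infty_{b,2}(\hat x, \hat\theta, k)$ be the far field patterns of the Born
approximations corresponding to observation $\hat x$, angle of incidence $\hat\theta$, wave
number $k$, and indices of refraction $n_1$ and $n_2$, respectively. Assume that

\[ u^\infty_{b,1}(\hat x, \hat\theta, k) = u^\infty_{b,2}(\hat x, \hat\theta, k) \]

for all $\hat x \in S^2$ and $k \in [k_1, k_2] \subset \mathbb{R}^+$ and some $\hat\theta \in S^2$. Prove that $n_1 = n_2$.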
7.3 Prove the following result, sometimes called Karp's theorem. Let $u^\infty(\hat x; \hat\theta)$,
$\hat x, \hat\theta \in S^2$, be the far field pattern and assume that there exists a function
$f : [-1,1] \to \mathbb{C}$ with

\[ u^\infty(\hat x; \hat\theta) = f(\hat x \cdot \hat\theta) \quad \text{for all } \hat x, \hat\theta \in S^2. \]

Prove that the index of refraction $n$ has to be radially symmetric: $n = n(r)$.
Hint: Rotate the geometry and use the uniqueness result.
7.4 Show that for any $a > 0$

\[ \max_{|x| \le a} \int_{|y| < a} \frac{1}{|x-y|}\, dy = 2\pi a^2. \]

Hint: Define $u(x)$ as the volume integral. Apply Theorem 7.11 to show
that $u \in C^1(\mathbb{R}^3)$ satisfies $\Delta u = -4\pi$ for $|x| < a$ and $\Delta u = 0$ for $|x| > a$,
and solve this elliptic equation explicitly by separation of variables.

7.5 Show that the characteristic functions $f_\eta$ from (7.94) for $\eta = \sqrt{n} = 1/2$
and $\eta = 2/3$ have the forms

\[ f_{1/2}(k) = \sin^3 \frac{k}{2}, \quad k \in \mathbb{C}, \]
\[ f_{2/3}(k) = \frac{2}{3} \sin^3 \frac{k}{3} \left[ 3 + 2\cos\frac{2k}{3} \right], \quad k \in \mathbb{C}, \]

respectively. Discuss the existence of zeros of these functions and justify
for these examples the assertions of Theorem 7.48.

Hint: Use the addition formulas for the trigonometric functions to express
$f_\eta$ in terms of $\sin(k/3)$ and $\cos(k/3)$.


Appendix A 


Basic Facts from Functional Analysis


In this appendix, we collect some of the basic definitions and theorems from 
functional analysis. We prove only those theorems whose proofs are not easily 
accessible. We recommend the monographs [151, 168, 230, 271] for a compre- 
hensive treatment of linear and nonlinear functional analysis. 


A.1 Normed Spaces and Hilbert Spaces 
First, we recall two basic definitions. 


Definition A.1 (Scalar Product, Pre-Hilbert Space)
Let $X$ be a vector space over the field $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$. A scalar product or
inner product is a mapping

\[ (\cdot,\cdot)_X : X \times X \to \mathbb{K} \]

with the following properties:

(i) $(x+y, z)_X = (x,z)_X + (y,z)_X$ for all $x, y, z \in X$,

(ii) $(\alpha x, y)_X = \alpha\,(x,y)_X$ for all $x, y \in X$ and $\alpha \in \mathbb{K}$,

(iii) $(x,y)_X = \overline{(y,x)_X}$ for all $x, y \in X$,

(iv) $(x,x)_X \in \mathbb{R}$ and $(x,x)_X \ge 0$ for all $x \in X$,

(v) $(x,x)_X > 0$ if $x \ne 0$.

A vector space $X$ over $\mathbb{K}$ with inner product $(\cdot,\cdot)_X$ is called a pre-Hilbert space
over $\mathbb{K}$.

The following properties are easily derived from the definition:

(vii) $(x, \alpha y)_X = \overline{\alpha}\,(x,y)_X$ for all $x, y \in X$ and $\alpha \in \mathbb{K}$.


Definition A.2 (Norm)
Let $X$ be a vector space over the field $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$. A norm on $X$ is a
mapping

\[ \|\cdot\|_X : X \to \mathbb{R} \]

with the following properties:

(i) $\|x\|_X > 0$ for all $x \in X$ with $x \ne 0$,

(ii) $\|\alpha x\|_X = |\alpha|\, \|x\|_X$ for all $x \in X$ and $\alpha \in \mathbb{K}$,

(iii) $\|x + y\|_X \le \|x\|_X + \|y\|_X$ for all $x, y \in X$.

A vector space $X$ over $\mathbb{K}$ with norm $\|\cdot\|_X$ is called a normed space over $\mathbb{K}$.

Property (iii) is called the triangle inequality. Applying it to the identities
$x = (x-y) + y$ and $y = (y-x) + x$ yields the second triangle inequality
$\|x - y\|_X \ge \bigl|\, \|x\|_X - \|y\|_X \bigr|$ for all $x, y \in X$.
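For completeness, the second triangle inequality can be written out in two lines. The following is a short worked derivation that uses only property (iii) and, for the symmetry $\|y-x\|_X = \|x-y\|_X$, property (ii) with $\alpha = -1$.

```latex
\begin{align*}
\|x\|_X &= \|(x-y)+y\|_X \le \|x-y\|_X + \|y\|_X
  &&\Longrightarrow\quad \|x\|_X - \|y\|_X \le \|x-y\|_X, \\
\|y\|_X &= \|(y-x)+x\|_X \le \|y-x\|_X + \|x\|_X
  &&\Longrightarrow\quad \|y\|_X - \|x\|_X \le \|x-y\|_X .
\end{align*}
% Both differences are bounded by \|x-y\|_X, hence so is their absolute value:
\[ \bigl|\, \|x\|_X - \|y\|_X \bigr| \le \|x-y\|_X . \]
```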


Theorem A.3 Let $X$ be a pre-Hilbert space. The mapping $\|\cdot\|_X : X \to \mathbb{R}$
defined by

\[ \|x\|_X := \sqrt{(x,x)_X}, \quad x \in X, \]

is a norm; that is, it has properties (i), (ii), and (iii) of Definition A.2.
Furthermore,

(iv) $|(x,y)_X| \le \|x\|_X \|y\|_X$ for all $x, y \in X$ (Cauchy-Schwarz inequality),

(v) $\|x \pm y\|^2_X = \|x\|^2_X + \|y\|^2_X \pm 2\,\mathrm{Re}(x,y)_X$ for all $x, y \in X$
(binomial formula),

(vi) $\|x + y\|^2_X + \|x - y\|^2_X = 2\|x\|^2_X + 2\|y\|^2_X$ for all $x, y \in X$.
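The three relations of Theorem A.3 can be checked numerically in the space of complex vectors with the inner product "sum of x_k times the conjugate of y_k". The following is a minimal sketch; the two sample vectors and the helper names `inner` and `norm` are our own illustrative choices and not part of the text.

```python
import math

def inner(x, y):
    """Inner product (x, y) = sum_k x_k * conj(y_k) on C^n."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    """Norm induced by the inner product, ||x|| = sqrt((x, x))."""
    return math.sqrt(inner(x, x).real)

# Two arbitrary sample vectors in C^3 (illustrative values only).
x = [1 + 2j, -0.5j, 3 + 0j]
y = [2 - 1j, 4 + 0j, 1j]
plus = [a + b for a, b in zip(x, y)]
minus = [a - b for a, b in zip(x, y)]

# (iv) Cauchy-Schwarz: ||x|| ||y|| - |(x, y)| must be nonnegative.
cs_gap = norm(x) * norm(y) - abs(inner(x, y))

# (v) binomial formula with the plus sign: residual should vanish.
binomial_residual = norm(plus) ** 2 - (
    norm(x) ** 2 + norm(y) ** 2 + 2 * inner(x, y).real
)

# (vi) parallelogram equality: residual should vanish.
parallelogram_residual = (
    norm(plus) ** 2 + norm(minus) ** 2 - 2 * norm(x) ** 2 - 2 * norm(y) ** 2
)

print(cs_gap, binomial_residual, parallelogram_residual)
```

Up to rounding errors, the two residuals are zero and the Cauchy-Schwarz gap is nonnegative.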


In the following example, we list some of the most important pre-Hilbert 
and normed spaces. 


Example A.4

(a) $\mathbb{C}^n$ is a pre-Hilbert space of dimension $n$ over $\mathbb{C}$ with inner product
$(x,y)_2 := \sum_{k=1}^n x_k \overline{y_k}$.

(b) $\mathbb{C}^n$ is a pre-Hilbert space of dimension $2n$ over $\mathbb{R}$ with inner product
$(x,y)_2 := \mathrm{Re} \sum_{k=1}^n x_k \overline{y_k}$.

(c) $\mathbb{R}^n$ is a pre-Hilbert space of dimension $n$ over $\mathbb{R}$ with inner product
$(x,y)_2 := \sum_{k=1}^n x_k y_k$.

(d) For $p \ge 1$ define the set $\ell^p$ of complex-valued sequences by

\[ \ell^p := \Bigl\{ (x_k) \subset \mathbb{C} : \sum_{k=1}^\infty |x_k|^p < \infty \Bigr\}. \tag{A.1} \]

Then $\ell^p$ is a linear space over $\mathbb{C}$ because if $(x_k), (y_k) \in \ell^p$, then $(\lambda x_k)$ and
$(x_k + y_k)$ are also in $\ell^p$. The latter follows from the inequality
$|x_k + y_k|^p \le (2 \max\{|x_k|, |y_k|\})^p \le 2^p (|x_k|^p + |y_k|^p)$. The expression

\[ \|x\|_{\ell^p} := \Bigl( \sum_{k=1}^\infty |x_k|^p \Bigr)^{1/p}, \quad x = (x_k) \in \ell^p, \]

defines a norm in $\ell^p$. The triangle inequality in the case $p > 1$ is known
as the Minkowski inequality. In the case $p = 2$, the sesquilinear form

\[ (x,y)_{\ell^2} := \sum_{k=1}^\infty x_k \overline{y_k}, \quad x = (x_k),\ y = (y_k) \in \ell^2, \]

defines an inner product on $\ell^2$. It is well-defined by the Cauchy-Schwarz
inequality.

(e) The space $C[a,b]$ of (real- or complex-valued) continuous functions on
$[a,b]$ is a pre-Hilbert space over $\mathbb{R}$ or $\mathbb{C}$ with inner product

\[ (x,y)_{L^2} := \int_a^b x(t)\, \overline{y(t)}\, dt, \quad x, y \in C[a,b]. \tag{A.2a} \]

The corresponding norm is called the Euclidean norm and is denoted by

\[ \|x\|_{L^2} := \sqrt{(x,x)_{L^2}} = \sqrt{\int_a^b |x(t)|^2\, dt}, \quad x \in C[a,b]. \tag{A.2b} \]

(f) On the same vector space $C[a,b]$ as in example (e), we introduce a norm
by

\[ \|x\|_\infty := \max_{a \le t \le b} |x(t)|, \quad x \in C[a,b], \tag{A.3} \]

that we call the supremum norm.

(g) Let $m \in \mathbb{N}$ and $\alpha \in (0,1)$. We define the spaces $C^m[a,b]$ and $C^{m,\alpha}[a,b]$
by

\[ C^m[a,b] := \bigl\{ x \in C[a,b] : x \text{ is } m \text{ times continuously differentiable on } [a,b] \bigr\}, \]
\[ C^{m,\alpha}[a,b] := \Bigl\{ x \in C^m[a,b] : \sup_{t \ne s} \frac{|x^{(m)}(t) - x^{(m)}(s)|}{|t-s|^\alpha} < \infty \Bigr\}, \]

and we equip them with norms

\[ \|x\|_{C^m} := \max_{0 \le k \le m} \|x^{(k)}\|_\infty, \tag{A.4a} \]
\[ \|x\|_{C^{m,\alpha}} := \|x\|_{C^m} + \sup_{s \ne t} \frac{|x^{(m)}(t) - x^{(m)}(s)|}{|t-s|^\alpha}. \tag{A.4b} \]

Every normed space carries a topology introduced by the norm; that is, we
can define open, closed, and compact sets; convergent sequences; continuous
functions; etc. We introduce balls of radius $r$ and center $x \in X$ by

\[ B(x,r) := \{ y \in X : \|y - x\|_X < r \}, \quad B[x,r] := \{ y \in X : \|y - x\|_X \le r \}. \]

Definition A.5 Let $X$ be a normed space over the field $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.

(a) A subset $M \subset X$ is called bounded if there exists $r > 0$ with $M \subset B(0,r)$.
The set $M \subset X$ is called open if for every $x \in M$ there exists $\varepsilon > 0$ such
that $B(x,\varepsilon) \subset M$. The set $M \subset X$ is called closed if the complement
$X \setminus M$ is open.

(b) A sequence $(x_k)_k$ in $X$ is called bounded if there exists $c > 0$ such that
$\|x_k\|_X \le c$ for all $k$. The sequence $(x_k)_k$ in $X$ is called convergent if there
exists $x \in X$ such that $\|x - x_k\|_X$ converges to zero in $\mathbb{R}$. We denote the
limit by $x = \lim_{k\to\infty} x_k$, or we write $x_k \to x$ as $k \to \infty$. The sequence
$(x_k)_k$ in $X$ is called a Cauchy sequence if for every $\varepsilon > 0$ there exists
$N \in \mathbb{N}$ with $\|x_m - x_k\|_X < \varepsilon$ for all $m, k \ge N$.

(c) Let $(x_k)_k$ be a sequence in $X$. A point $x \in X$ is called an accumulation
point if there exists a subsequence $(x_{k_n})_n$ that converges to $x$.

(d) A set $M \subset X$ is called compact if every sequence in $M$ has an
accumulation point in $M$.


Example A.6

Let $X = C[0,1]$ over $\mathbb{R}$ and $x_k(t) = t^k$, $t \in [0,1]$, $k \in \mathbb{N}$. The sequence $(x_k)_k$
converges to zero with respect to the Euclidean norm $\|\cdot\|_{L^2}$ introduced in (A.2b).
With respect to the supremum norm $\|\cdot\|_\infty$ of (A.3), however, the sequence does
not converge to zero.
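The two behaviours in Example A.6 can be made explicit: for the monomial t^k on [0,1] the Euclidean norm equals 1/sqrt(2k+1), which tends to zero, whereas the supremum norm equals 1 for every k. The following sketch confirms this numerically; the midpoint quadrature, the grid sizes, and the helper names are assumptions made only for this illustration.

```python
import math

def l2_norm(f, a=0.0, b=1.0, n=20000):
    """Approximate the Euclidean norm (A.2b) by the composite midpoint rule."""
    h = (b - a) / n
    total = sum(abs(f(a + (i + 0.5) * h)) ** 2 for i in range(n))
    return math.sqrt(h * total)

def sup_norm(f, a=0.0, b=1.0, n=2000):
    """Approximate the supremum norm (A.3) by sampling on a uniform grid."""
    return max(abs(f(a + i * (b - a) / n)) for i in range(n + 1))

norms = {}
for k in (1, 10, 100):
    x_k = lambda t, k=k: t ** k
    norms[k] = (l2_norm(x_k), sup_norm(x_k))
    # Exact value of the L2 norm of t^k on [0, 1] is 1 / sqrt(2k + 1).
    print(k, norms[k][0], 1 / math.sqrt(2 * k + 1), norms[k][1])
```

The first column of norms decreases like 1/sqrt(2k+1), while the sampled supremum norm stays at 1 because the grid contains the endpoint t = 1.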


It is easy to prove (see Problem A.1) that a set $M$ is closed if and only if
the limit of every convergent sequence $(x_k)_k$ in $M$ also belongs to $M$. The sets

\[ \mathrm{int}(M) := \{ x \in M : \text{there exists } \varepsilon > 0 \text{ with } B(x,\varepsilon) \subset M \} \]

and

\[ \mathrm{closure}(M) := \bigl\{ x \in X : \text{there exists } (x_k)_k \text{ in } M \text{ with } x = \lim_{k\to\infty} x_k \bigr\} \]

are called the interior and closure, respectively, of $M$. The set $M \subset X$ is called
dense in $X$ if $\mathrm{closure}(M) = X$.

In general, the topological properties depend on the norm in $X$ as we have
seen already in Example A.6. For finite-dimensional spaces, however, these
properties are independent of the norm. This is seen from the following theorem.


Theorem A.7 Let $X$ be a finite-dimensional space with norms $\|\cdot\|_1$ and $\|\cdot\|_2$.
Then both norms are equivalent; that is, there exist constants $c_2 \ge c_1 > 0$ with

\[ c_1 \|x\|_1 \le \|x\|_2 \le c_2 \|x\|_1 \quad \text{for all } x \in X. \]

In other words, every ball with respect to $\|\cdot\|_1$ contains a ball with respect
to $\|\cdot\|_2$ and vice versa.
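As a concrete instance of Theorem A.7, take the real n-dimensional space with the maximum norm in the role of the first norm and the Euclidean norm in the role of the second. Then the constants can be chosen as c_1 = 1 and c_2 = sqrt(n). The sketch below tests these bounds on random vectors; the dimension, the sample size, and the random seed are arbitrary choices for the illustration.

```python
import math
import random

def norm_max(x):
    """Maximum norm on R^n."""
    return max(abs(v) for v in x)

def norm_2(x):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(v * v for v in x))

random.seed(0)  # fixed seed so that the run is reproducible
n = 5
ratios = []
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(n)]
    ratios.append(norm_2(x) / norm_max(x))

# Equivalence of norms: 1 <= ||x||_2 / ||x||_max <= sqrt(n).
print(min(ratios), max(ratios), math.sqrt(n))
```

The observed ratios stay inside the interval from 1 to sqrt(n), as the norm equivalence predicts. Note that the upper constant grows with the dimension, which hints at why the statement fails in infinite-dimensional spaces.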
Further properties are collected in the following theorem. 


Theorem A.8 Let X be a normed space over K and M C X be a subset. 


(a) M is closed if and only if M = closure(M), and M is open if and only if 
M = int(M). 


(b) If $M \ne X$ is a linear subspace, then $\mathrm{int}(M) = \emptyset$, and $\mathrm{closure}(M)$ is also
a linear subspace.


(c) In finite-dimensional spaces, every subspace is closed. 


(d) Every compact set is closed and bounded. In finite-dimensional spaces, 
the reverse is also true (Theorem of Bolzano—Weierstrass): In a finite- 
dimensional normed space, every closed and bounded set is compact. 


A crucial property of the set of real numbers is its completeness. It is also a 
necessary assumption for many results in functional analysis. 


Definition A.9 (Banach Space, Hilbert Space) 
A normed space X over K is called complete or a Banach space if every Cauchy 
sequence converges in X. A complete pre-Hilbert space is called a Hilbert space. 


The spaces $\mathbb{C}^n$ and $\mathbb{R}^n$ are Hilbert spaces with respect to their canonical
inner products. The space $C[a,b]$ is not complete with respect to the inner
product $(\cdot,\cdot)_{L^2}$ of (A.2a)! As an example, we consider the sequence $x_k(t) = t^k$
for $0 \le t \le 1$ and $x_k(t) = 1$ for $1 \le t \le 2$. Then $(x_k)_k$ is a Cauchy sequence
in $C[0,2]$ but does not converge in $C[0,2]$ with respect to $(\cdot,\cdot)_{L^2}$ because it
converges to the function

\[ x(t) = \begin{cases} 0, & t < 1, \\ 1, & t \ge 1, \end{cases} \]

that is not continuous. The space $(C[a,b], \|\cdot\|_\infty)$, however, is a Banach space.


Every normed space or pre-Hilbert space $X$ can be "completed"; that is,
there exists a "smallest" Banach or Hilbert space $\tilde X$, respectively, that extends
$X$ (that is, $\|x\|_X = \|x\|_{\tilde X}$ or $(x,y)_X = (x,y)_{\tilde X}$, respectively, for all $x, y \in X$).
More precisely, we have the following (formulated only for normed spaces).

Theorem A.10 Let $X$ be a normed space with norm $\|\cdot\|_X$. There exist a
Banach space $(\tilde X, \|\cdot\|_{\tilde X})$ and an injective linear operator $J : X \to \tilde X$ such that

(i) The range $J(X) \subset \tilde X$ is dense in $\tilde X$, and

(ii) $\|Jx\|_{\tilde X} = \|x\|_X$ for all $x \in X$; that is, $J$ preserves the norm.

Furthermore, $\tilde X$ is uniquely determined in the sense that if $\hat X$ is a second space
with properties (i) and (ii) with respect to a linear injective operator $\hat J$, then
the operator $\hat J J^{-1} : J(X) \to \hat J(X)$ has an extension to a norm-preserving
isomorphism from $\tilde X$ onto $\hat X$. In other words, $\tilde X$ and $\hat X$ can be identified.


We denote the completion of the pre-Hilbert space $(C[a,b], (\cdot,\cdot)_{L^2})$ by $L^2(a,b)$.
Using Lebesgue integration theory, it can be shown that the space $L^2(a,b)$
is characterized as follows. (The notions "measurable," "almost everywhere"
(a.e.), and "integrable" are understood with respect to the Lebesgue measure.)
First, we define the vector space

\[ \mathcal{L}^2(a,b) := \{ x : (a,b) \to \mathbb{C} : x \text{ is measurable and } |x|^2 \text{ integrable} \}, \]

where addition and scalar multiplication are defined pointwise almost
everywhere. Then $\mathcal{L}^2(a,b)$ is a vector space because, for $x, y \in \mathcal{L}^2(a,b)$ and $\alpha \in \mathbb{C}$,
$x + y$ and $\alpha x$ are also measurable and $\alpha x, x + y \in \mathcal{L}^2(a,b)$, the latter by the
binomial theorem $|x(t) + y(t)|^2 \le 2|x(t)|^2 + 2|y(t)|^2$. We define a sesquilinear
form on $\mathcal{L}^2(a,b)$ by

\[ \langle x, y \rangle := \int_a^b x(t)\, \overline{y(t)}\, dt, \quad x, y \in \mathcal{L}^2(a,b). \]

$\langle\cdot,\cdot\rangle$ is not an inner product on $\mathcal{L}^2(a,b)$ because $\langle x, x \rangle = 0$ only implies that $x$
vanishes almost everywhere; that is, that $x \in \mathcal{N}$, where $\mathcal{N}$ is defined by

\[ \mathcal{N} := \{ x \in \mathcal{L}^2(a,b) : x(t) = 0 \text{ a.e. on } (a,b) \}. \]

Now we define $L^2(a,b)$ as the factor space

\[ L^2(a,b) := \mathcal{L}^2(a,b) / \mathcal{N} \]

and equip $L^2(a,b)$ with the inner product

\[ ([x],[y])_{L^2} := \int_a^b x(t)\, \overline{y(t)}\, dt, \quad x \in [x],\ y \in [y]. \]

Here, $[x], [y] \in L^2(a,b)$ are equivalence classes of functions in $\mathcal{L}^2(a,b)$. Then it
can be shown that this definition is well defined and yields an inner product
on $L^2(a,b)$. From now on, we write $x \in L^2(a,b)$ instead of $x \in [x] \in L^2(a,b)$.
Furthermore, it can be shown by fundamental results of Lebesgue integration
theory that $L^2(a,b)$ is complete; that is, a Hilbert space and contains $C[a,b]$ as
a dense subspace.

Definition A.11 (Separable Space)

The normed space $X$ is called separable if there exists a countable dense subset
$M \subset X$; that is, if there exist $M$ and a bijective mapping $j : \mathbb{N} \to M$ with
$\mathrm{closure}(M) = X$.

The spaces $\mathbb{C}^n$, $\mathbb{R}^n$, $L^2(a,b)$, and $C[a,b]$ are all separable. For the first two
examples, let $M$ consist of all vectors with rational coefficients; for the latter
examples, take polynomials with rational coefficients.


Definition A.12 (Orthogonal Complement)
Let $X$ be a pre-Hilbert space (over $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$).

(a) Two elements $x$ and $y$ are called orthogonal if $(x,y)_X = 0$.

(b) Let $M \subset X$ be a subset. The set

\[ M^\perp := \{ x \in X : (x,y)_X = 0 \text{ for all } y \in M \} \]

is called the orthogonal complement of $M$.

$M^\perp$ is always a closed subspace and $M \subset (M^\perp)^\perp$. Furthermore, $A \subset B$
implies that $B^\perp \subset A^\perp$.

The following theorem is a fundamental result in Hilbert space theory and 
relies heavily on the completeness property. 


Theorem A.13 (Projection Theorem)

Let $X$ be a pre-Hilbert space and $V \subset X$ be a complete subspace. Then
$V = (V^\perp)^\perp$. Every $x \in X$ possesses a unique decomposition of the form $x = v + w$,
where $v \in V$ and $w \in V^\perp$. The operator $P : X \to V$, $x \mapsto v$, is called the
orthogonal projection operator onto $V$ and has the properties

(a) $Pv = v$ for $v \in V$; that is, $P^2 = P$;

(b) $\|x - Px\|_X \le \|x - v'\|_X$ for all $v' \in V$.

This means that $Px \in V$ is the best approximation of $x \in X$ in the subspace
$V$.


A.2 Orthonormal Systems 


In this section, let X always be a separable Hilbert space over the field K = R 
or C. 


Definition A.14 (Orthonormal System)
A countable set of elements $A = \{x_k : k = 1,2,3,\ldots\}$ is called an orthonormal
system (ONS) if

(i) $(x_k, x_j)_X = 0$ for all $k \ne j$ and

(ii) $\|x_k\|_X = 1$ for all $k \in \mathbb{N}$.

$A$ is called a complete or a maximal orthonormal system if, in addition, there
is no ONS $B$ with $A \subset B$ and $A \ne B$.

One can show using Zorn's Lemma that every separable Hilbert space possesses
a maximal ONS. Furthermore, it is well known from linear algebra that every
countable set of linearly independent elements of $X$ can be orthonormalized.
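The orthonormalization mentioned above is usually carried out by the Gram-Schmidt process. The following is a minimal sketch for finitely many vectors in real n-dimensional space with the Euclidean inner product; it uses the so-called modified variant (each projection is subtracted from the running remainder), and the function name `gram_schmidt`, the tolerance, and the sample vectors are our own choices.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal system spanning the same space as `vectors`.

    Raises ValueError if the input vectors are (numerically) linearly dependent.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(w, e)  # component of the remainder along e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = math.sqrt(inner(w, w))
        if length < tol:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / length for wi in w])
    return basis

vectors = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ons = gram_schmidt(vectors)

# Gram matrix of the result; it should be the identity matrix.
gram = [[inner(e, f) for f in ons] for e in ons]
for row in gram:
    print([round(v, 12) for v in row])
```

The printed Gram matrix is the identity up to rounding, which is exactly conditions (i) and (ii) of Definition A.14.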


For any set $A \subset X$, let

\[ \mathrm{span}\, A := \Bigl\{ \sum_{k=1}^n \alpha_k x_k : \alpha_k \in \mathbb{K},\ x_k \in A,\ n \in \mathbb{N} \Bigr\} \tag{A.5} \]

be the subspace of $X$ spanned by $A$.

Theorem A.15 Let $A = \{x_k : k = 1,2,3,\ldots\}$ be an orthonormal system. Then

(a) Every finite subset of $A$ is linearly independent.

(b) If $A$ is finite; that is, $A = \{x_k : k = 1,2,\ldots,n\}$, then for every $x \in X$
there exist uniquely determined coefficients $\alpha_k \in \mathbb{K}$, $k = 1,\ldots,n$, such
that

\[ \Bigl\| x - \sum_{k=1}^n \alpha_k x_k \Bigr\|_X \le \|x - a\|_X \quad \text{for all } a \in \mathrm{span}\, A. \tag{A.6} \]

The coefficients $\alpha_k$ are given by $\alpha_k = (x, x_k)_X$ for $k = 1,\ldots,n$.

(c) For every $x \in X$, the following Bessel inequality holds:

\[ \sum_{k=1}^\infty |(x, x_k)_X|^2 \le \|x\|^2_X, \tag{A.7} \]

and the series $\sum_{k=1}^\infty (x, x_k)_X\, x_k$ converges in $X$.

(d) $A$ is complete if and only if $\mathrm{span}\, A$ is dense in $X$.

(e) $A$ is complete if and only if for all $x \in X$ the following Parseval equation
holds:

\[ \sum_{k=1}^\infty |(x, x_k)_X|^2 = \|x\|^2_X. \tag{A.8} \]

(f) $A$ is complete if and only if every $x \in X$ has a (generalized) Fourier
expansion of the form

\[ x = \sum_{k=1}^\infty (x, x_k)_X\, x_k, \tag{A.9} \]

where the convergence is understood in the norm of $X$. In this case, the
Parseval equation holds in the following more general form:

\[ (x,y)_X = \sum_{k=1}^\infty (x, x_k)_X\, \overline{(y, x_k)_X}. \tag{A.10} \]
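Parts (b) and (c) of Theorem A.15 can be illustrated with a small finite orthonormal system in three-dimensional real space. The sketch below computes the coefficients as inner products with the system elements, checks the Bessel inequality, and compares the resulting approximation error with the errors obtained from perturbed coefficients. The two-element system, the vector x, and the perturbation sizes are assumptions chosen only for this example.

```python
import math

def inner(x, y):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def combine(coeffs, system):
    """Linear combination sum_k coeffs[k] * system[k]."""
    n = len(system[0])
    return [sum(c * e[i] for c, e in zip(coeffs, system)) for i in range(n)]

def dist(x, y):
    """Euclidean distance between x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

r = 1 / math.sqrt(2)
system = [[1.0, 0.0, 0.0], [0.0, r, r]]  # orthonormal, but not complete in R^3
x = [1.0, 2.0, 3.0]

alpha = [inner(x, e) for e in system]     # coefficients (x, x_k)
bessel_sum = sum(a * a for a in alpha)    # left-hand side of (A.7)
norm_sq = inner(x, x)                     # right-hand side of (A.7)
best_error = dist(x, combine(alpha, system))

# Errors for perturbed coefficients; none should beat best_error, see (A.6).
other_errors = [
    dist(x, combine([alpha[0] + d1, alpha[1] + d2], system))
    for d1 in (-0.5, 0.0, 0.5)
    for d2 in (-0.5, 0.0, 0.5)
]

print(bessel_sum, norm_sq, best_error, min(other_errors))
```

Here the Bessel sum is 13.5 and the squared norm of x is 14, so the inequality is strict, which reflects that the system is not complete; the squared best approximation error is the difference 0.5.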


This important theorem includes, as special examples, the classical Fourier 
expansion of periodic functions and the expansion with respect to orthogonal 
polynomials. We recall two examples. 


Example A.16 (Fourier Expansion)

(a) The functions $x_k(t) := \exp(ikt)/\sqrt{2\pi}$, $k \in \mathbb{Z}$, form a complete system of
orthonormal functions in $L^2(0,2\pi)$. By part (f) of the previous theorem, every
function $x \in L^2(0,2\pi)$ has an expansion of the form

\[ x(t) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{ikt} \int_0^{2\pi} x(s)\, e^{-iks}\, ds, \]

where the convergence is understood in the sense of $L^2$; that is,

\[ \int_0^{2\pi} \Bigl| x(t) - \frac{1}{2\pi} \sum_{k=-M}^{N} e^{ikt} \int_0^{2\pi} x(s)\, e^{-iks}\, ds \Bigr|^2 dt \longrightarrow 0 \]

as $M, N$ tend to infinity. Parseval's identity takes the form

\[ \sum_{k \in \mathbb{Z}} |a_k|^2 = \frac{1}{2\pi} \|x\|^2_{L^2}, \quad a_k = \frac{1}{2\pi} \int_0^{2\pi} x(s)\, e^{-iks}\, ds. \tag{A.11} \]

For smooth periodic functions, one can even show uniform convergence (see
Section A.4).

(b) The Legendre polynomials $P_k$, $k = 0,1,\ldots$, form a maximal orthonormal
system in $L^2(-1,1)$. They are defined by

\[ P_k(t) = \gamma_k \frac{d^k}{dt^k} (1 - t^2)^k, \quad t \in (-1,1),\ k = 0, 1, \ldots, \]

with normalizing constants

\[ \gamma_k = \sqrt{\frac{2k+1}{2}}\; \frac{1}{k!\, 2^k}. \]

We refer to [135] for details.
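Parseval's identity from part (a) can be tested on a simple function. For x(t) = t on (0, 2π) one finds by partial integration the coefficients a_0 = π and a_k = i/k for k ≠ 0, so that the sum of the squared moduli equals π² + 2 Σ 1/k² = 4π²/3, which coincides with ‖x‖²/(2π). The sketch below approximates the coefficients by the midpoint rule and forms a partial sum; the choice of x, the truncation index, and the quadrature are illustrative assumptions.

```python
import cmath
import math

def fourier_coefficient(x, k, n=2000):
    """Midpoint-rule approximation of a_k = (1/2pi) * int_0^{2pi} x(s) e^{-iks} ds."""
    h = 2 * math.pi / n
    total = 0j
    for i in range(n):
        s = (i + 0.5) * h
        total += x(s) * cmath.exp(-1j * k * s)
    return total * h / (2 * math.pi)

x = lambda s: s   # sample function x(t) = t
K = 50            # truncation index of the partial sum
coeffs = {k: fourier_coefficient(x, k) for k in range(-K, K + 1)}

partial = sum(abs(a) ** 2 for a in coeffs.values())
limit = 4 * math.pi ** 2 / 3   # equals ||x||^2 / (2 pi) for x(t) = t

print(coeffs[0], coeffs[3], partial, limit)
```

The partial sum stays below the limit value, in agreement with the Bessel inequality, and the gap is the tail 2 Σ_{k>50} 1/k², which is roughly 0.04.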


Other important examples will be given later.

A.3 Linear Bounded and Compact Operators


For this section, let $X$ and $Y$ always be normed spaces and $A : X \to Y$ be a
linear operator.


Definition A.17 (Boundedness, Norm of A)
The linear operator $A$ is called bounded if there exists $c > 0$ such that

\[ \|Ax\|_Y \le c\, \|x\|_X \quad \text{for all } x \in X. \]

The smallest of these constants is called the norm of $A$; that is,

\[ \|A\|_{\mathcal{L}(X,Y)} := \sup_{x \ne 0} \frac{\|Ax\|_Y}{\|x\|_X}. \tag{A.12} \]


Theorem A.18 The following assertions are equivalent:

(a) $A$ is bounded.

(b) $A$ is continuous at $x = 0$; that is, $x_j \to 0$ implies that $Ax_j \to 0$.

(c) $A$ is continuous for every $x \in X$.

The space $\mathcal{L}(X,Y)$ of all linear bounded mappings from $X$ to $Y$ with the
operator norm is a normed space; that is, the operator norm has properties (i), (ii),
and (iii) of Definition A.2 and the following: Let $B \in \mathcal{L}(X,Y)$ and $A \in \mathcal{L}(Y,Z)$;
then $AB \in \mathcal{L}(X,Z)$ and $\|AB\|_{\mathcal{L}(X,Z)} \le \|A\|_{\mathcal{L}(Y,Z)} \|B\|_{\mathcal{L}(X,Y)}$.


Integral operators are the most important examples for our purposes.

Theorem A.19 (a) Let $k \in L^2((c,d) \times (a,b))$. The operator

\[ (Ax)(t) := \int_a^b k(t,s)\, x(s)\, ds, \quad t \in (c,d),\ x \in L^2(a,b), \tag{A.13} \]

is well-defined, linear, and bounded from $L^2(a,b)$ into $L^2(c,d)$. Furthermore,

\[ \|A\|_{\mathcal{L}(L^2(a,b), L^2(c,d))} \le \sqrt{ \int_c^d \int_a^b |k(t,s)|^2\, ds\, dt }. \]

(b) Let $k$ be continuous on $[c,d] \times [a,b]$. Then $A$ is also well-defined, linear, and
bounded from $C[a,b]$ into $C[c,d]$ and

\[ \|A\|_\infty := \|A\|_{\mathcal{L}(C[a,b], C[c,d])} = \max_{t \in [c,d]} \int_a^b |k(t,s)|\, ds. \]

We can extend this theorem to integral operators with weakly singular
kernels. We recall that a kernel $k$ is called weakly singular on $[a,b] \times [a,b]$ if $k$ is
defined and continuous for all $t, s \in [a,b]$, $t \ne s$, and there exist constants $c > 0$
and $\alpha \in [0,1)$ such that

\[ |k(t,s)| \le c\, |t-s|^{-\alpha} \quad \text{for all } t, s \in [a,b],\ t \ne s. \]

Theorem A.20 Let $k$ be weakly singular on $[a,b]$. Then the integral operator
$A$, defined by (A.13) for $[c,d] = [a,b]$, is well-defined and bounded as an operator
in $L^2(a,b)$ as well as in $C[a,b]$.


For the special case $Y = \mathbb{K}$, we denote by $X^* := \mathcal{L}(X,\mathbb{K})$ the dual space of
$X$. Often we write $\langle \ell, x \rangle_{X^*,X}$ instead of $\ell(x)$ for $\ell \in X^*$ and $x \in X$ and call
$\langle\cdot,\cdot\rangle_{X^*,X}$ the dual pairing. The dual pairing is a bilinear form from $X^* \times X$ into
$\mathbb{K}$. The space $X^{**} = (X^*)^*$ is called the bidual of $X$. The canonical embedding
$J : X \to X^{**}$, defined by

\[ (Jx)\ell := \langle \ell, x \rangle_{X^*,X}, \quad x \in X,\ \ell \in X^*, \tag{A.14} \]

is linear, bounded, one-to-one, and satisfies $\|Jx\|_{X^{**}} = \|x\|_X$ for all $x \in X$.
We recall some important examples of dual spaces (where we write $\langle\cdot,\cdot\rangle$ for
the dual pairing).


Example A.21
Let again $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$.

(a) The dual of $\mathbb{K}^n$ can be identified with $\mathbb{K}^n$ itself. The identification
$I : \mathbb{K}^n \to (\mathbb{K}^n)^*$ is given by $\langle Ix, y \rangle := \sum_{j=1}^n x_j y_j$ for $x, y \in \mathbb{K}^n$.

(b) Let $p > 1$ and $q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$. The dual of $\ell^p$ (see Example A.4)
can be identified with $\ell^q$. The identification $I : \ell^q \to (\ell^p)^*$ is given by

\[ \langle Ix, y \rangle := \sum_{j=1}^\infty x_j y_j \quad \text{for } x = (x_j) \in \ell^q \text{ and } y = (y_j) \in \ell^p. \]

(c) The dual $(\ell^1)^*$ of the space $\ell^1$ can be identified with the space $\ell^\infty$ of
bounded sequences (equipped with the supremum norm). The identification
$I : \ell^\infty \to (\ell^1)^*$ is given by the form as in (b) for $x = (x_j) \in \ell^\infty$ and
$y = (y_j) \in \ell^1$.

(d) Let $c_0 \subset \ell^\infty$ be the space of sequences in $\mathbb{K}$ which converge to zero,
equipped with the supremum norm. Then $c_0^*$ can be identified with $\ell^1$.
The identification $I : \ell^1 \to c_0^*$ is given by the form as in (b) for $x \in \ell^1$
and $y \in c_0$.


Definition A.22 (Reflexive Space)

The normed space $X$ is called reflexive if the canonical embedding of $X$ into $X^{**}$
is surjective; that is, a norm-preserving isomorphism from $X$ onto the bidual
space $X^{**}$.

The spaces $\ell^p$ for $p > 1$ of Example A.21 are typical examples of reflexive
spaces. The spaces $\ell^1$, $\ell^\infty$, and $c_0$ fail to be reflexive.

The following important result gives a characterization of X* in Hilbert 
spaces. 


Theorem A.23 (Riesz)

Let $X$ be a Hilbert space. For every $x \in X$, the functional $\ell_x(y) := (y,x)_X$,
$y \in X$, defines a linear bounded mapping from $X$ to $\mathbb{K}$; that is, $\ell_x \in X^*$.
Furthermore, for every $\ell \in X^*$ there exists one and only one $x \in X$ with
$\ell(y) = (y,x)_X$ for all $y \in X$ and

\[ \|\ell\|_{X^*} := \sup_{y \ne 0} \frac{|\ell(y)|}{\|y\|_X} = \|x\|_X. \]

This theorem implies that every Hilbert space is reflexive. It also yields
the existence of a unique adjoint operator for every linear bounded operator
$A : X \to Y$. We recall that for any linear and bounded operator $A : X \to Y$
between normed spaces $X$ and $Y$ the dual operator $A^* : Y^* \to X^*$ is defined as
$A^* \ell = \ell \circ A$ for all $\ell \in Y^*$. Here $\ell \circ A$ is the composition of $A$ and $\ell$; that is,
$(\ell \circ A)x = \ell(Ax)$ for $x \in X$.


Theorem A.24 (Adjoint Operator)
Let $A : X \to Y$ be a linear and bounded operator between Hilbert spaces. Then
there exists one and only one linear bounded operator $A^* : Y \to X$ with the
property

\[ (Ax, y)_Y = (x, A^* y)_X \quad \text{for all } x \in X,\ y \in Y. \]

This operator $A^* : Y \to X$ is called the adjoint operator to $A$. For $X = Y$,
the operator $A$ is called self-adjoint if $A^* = A$.
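In the finite-dimensional case the adjoint is a familiar object: if the operator is given by a complex matrix and both spaces carry the inner product "sum of x_k times the conjugate of y_k", then the adjoint is represented by the conjugate transpose. The sketch below checks the defining relation between the two inner products for one sample; the matrix and the vectors are arbitrary illustrative values.

```python
def inner(x, y):
    """Inner product (x, y) = sum_k x_k * conj(y_k) on C^n."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of the matrix A."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

# A maps C^3 into C^2 (illustrative values only).
A = [[1 + 1j, 2 + 0j, 0j],
     [0j, 1j, 3 - 2j]]
x = [1 - 1j, 2 + 0j, 0.5j]
y = [3 + 0j, -1 + 4j]

lhs = inner(matvec(A, x), y)            # (Ax, y) in C^2
rhs = inner(x, matvec(adjoint(A), y))   # (x, A*y) in C^3
print(lhs, rhs)
```

Both numbers agree up to rounding. A square matrix is self-adjoint in the sense of the theorem exactly when it equals its own conjugate transpose.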


Example A.25
(a) Let $X = L^2(a,b)$, $Y = L^2(c,d)$, and $k \in L^2((c,d) \times (a,b))$. The adjoint $A^*$
of the integral operator

\[ (Ax)(t) = \int_a^b k(t,s)\, x(s)\, ds, \quad t \in (c,d),\ x \in L^2(a,b), \]

is given by

\[ (A^* y)(t) = \int_c^d \overline{k(s,t)}\, y(s)\, ds, \quad t \in (a,b),\ y \in L^2(c,d). \]

(b) Let the space $X = C[a,b]$ of continuous functions over $\mathbb{C}$ be supplied with
the $L^2$-inner product. Define $f, g : C[a,b] \to \mathbb{C}$ by

\[ f(x) := \int_a^b x(t)\, dt \quad \text{and} \quad g(x) := x(a) \quad \text{for } x \in C[a,b]. \]

Both $f$ and $g$ are linear. $f$ is bounded but $g$ is unbounded. There is an extension
of $f$ to a linear bounded functional (also denoted by $f$) on $L^2(a,b)$; that is,
$f \in L^2(a,b)^*$. By Theorem A.23, we can identify $L^2(a,b)^*$ with $L^2(a,b)$ itself.
For the given $f$, the representation function is just the constant function 1
because $f(x) = (x,1)_{L^2}$ for $x \in L^2(a,b)$. The adjoint of $f$ is calculated by

\[ f(x) \cdot \overline{y} = \int_a^b x(t)\, \overline{y}\, dt = (x,y)_{L^2} = (x, f^*(y))_{L^2} \]

for all $x \in L^2(a,b)$ and $y \in \mathbb{C}$. Therefore, $f^*(y) \in L^2(a,b)$ is the constant
function with value $y$.

(c) Let $X$ be the Sobolev space $H^1(a,b)$; that is, the space of $L^2$-functions that
possess generalized $L^2$-derivatives:

\[ H^1(a,b) := \Bigl\{ x \in L^2(a,b) : \text{there exists } \alpha \in \mathbb{K} \text{ and } y \in L^2(a,b) \text{ with } x(t) = \alpha + \int_a^t y(s)\, ds \text{ for } t \in (a,b) \Bigr\}. \]

We denote the generalized derivative $y \in L^2(a,b)$ by $x'$. We observe that
$H^1(a,b) \subset C[a,b]$ with bounded embedding. As an inner product in $H^1(a,b)$,
we define

\[ (x,y)_{H^1} := x(a)\, \overline{y(a)} + (x', y')_{L^2}, \quad x, y \in H^1(a,b). \]

Now let $Y = L^2(a,b)$ and $A : H^1(a,b) \to L^2(a,b)$ be the operator $x \mapsto x'$ for
$x \in H^1(a,b)$. Then $A$ is well-defined, linear, and bounded. It is easily seen that
the adjoint of $A$ is given by

\[ (A^* y)(t) = \int_a^t y(s)\, ds, \quad t \in (a,b),\ y \in L^2(a,b). \]


In the following situation, we consider the case that a Banach space $V$ is
contained in a Hilbert space $X$ with bounded imbedding $j : V \to X$ such that
also $j(V)$ is dense in $X$. We have in mind examples such as $H^1(0,1) \subset L^2(0,1)$.
Then the dual operator $j^*$ is a linear bounded operator from $X^*$ into $V^*$ with
dense range (the latter follows from the injectivity of $j$). Also, $j^*$ is one-to-one
because $j(V)$ is dense in $X$, see Problem A.3. Now we use the fact that $X$ and
$X^*$ are anti-isomorphic by the Theorem A.23 of Riesz; that is, the operator
$j_R : X \ni x \mapsto \ell_x \in X^*$ where $\ell_x(z) = (z,x)_X$ for $z \in X$ is bijective and
anti-linear; that is, satisfies $j_R(\lambda x + \mu y) = \overline{\lambda}\, j_R x + \overline{\mu}\, j_R y$ for all $x, y \in X$ and
$\lambda, \mu \in \mathbb{K}$. Therefore, also the composition $j^* \circ j_R : X \to V^*$ is anti-linear. For
this reason we define the anti-dual space $V'$ of $V$ by

\[ V' = \overline{V^*} = \{ \overline{\ell} : V \to \mathbb{K} : \ell \in V^* \} \tag{A.15} \]

where $\overline{\ell}(v) = \overline{\ell(v)}$ for all $v \in V$. Then the operator $j' := j^* \circ j_R : X \to V'$ is
linear and one-to-one with dense range, thus an imbedding. In this sense, $X$ is
densely imbedded in $V'$. We denote the application of $\ell \in V'$ to $v \in V$ by $\langle \ell, v \rangle$
and note that $(\ell, v) \mapsto \langle \ell, v \rangle$ is a sesquilinear form on $V' \times V$. (It should not
be mixed up with the dual pairing $\langle\cdot,\cdot\rangle_{V^*,V} : V^* \times V \to \mathbb{K}$ which is bilinear.)
From this analysis, we conclude that $(x,v)_X = \overline{\ell_x(v)} = \langle j'x, v \rangle$ for all $x \in X$
and $v \in V$ and thus

\[ |(x,v)_X| \le \|j'x\|_{V'}\, \|v\|_V \quad \text{for all } x \in X,\ v \in V. \tag{A.16} \]


Definition A.26 (a) A Gelfand triple (or rigged Hilbert space, see [100])
$V \subset X \subset V'$ consists of a reflexive Banach space $V$, a separable Hilbert space $X$,
and the anti-dual space $V'$ of $V$ (all over the same field $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$) such
that $V$ is a dense subspace of $X$, and the imbedding $j : V \to X$ is bounded.
Furthermore, the sesquilinear form $\langle\cdot,\cdot\rangle : V' \times V \to \mathbb{K}$ is an extension of the
inner product in $X$; that is, $\langle x, v \rangle = (x,v)_X$ for all $v \in V$ and $x \in X$.

(b) A linear bounded operator $K : V' \to V$ is called coercive if there exists
$\gamma > 0$ with

\[ |\langle x, Kx \rangle| \ge \gamma\, \|x\|^2_{V'} \quad \text{for all } x \in V'. \tag{A.17} \]

The operator $K$ satisfies Garding's inequality if there exists a linear compact
operator $C : V' \to V$ such that $K + C$ is coercive; that is,

\[ |\langle x, (K+C)x \rangle| \ge \gamma\, \|x\|^2_{V'} \quad \text{for all } x \in V'. \]

By the same arguments as in the proof of the Lax-Milgram theorem (see
[129]), it can be shown that every coercive operator is an isomorphism from
$V'$ onto $V$. Coercive operators play an important role in the study of partial
differential equations and integral equations by variational methods. Often, the
roles of $V$ and $V'$ are interchanged. For integral operators that are "smoothing",
our definition seems more appropriate. However, both definitions are equivalent
in the sense that the inverse operator $K^{-1} : V \to V'$ is coercive in the usual
sense with $\gamma$ replaced by $\gamma / \|K\|^2_{\mathcal{L}(V',V)}$.

The following theorems are two of the most important results of linear func- 
tional analysis. 

Theorem A.27 (Open Mapping Theorem)

Let $X, Y$ be Banach spaces and $A : X \to Y$ a linear bounded operator from $X$
onto $Y$. Then $A$ is open; that is, the images $A(U) \subset Y$ are open in $Y$ for all
open sets $U \subset X$. In particular, if $A$ is a bounded isomorphism from $X$ onto
$Y$, then the inverse $A^{-1} : Y \to X$ is bounded. This result is sometimes called
the Banach-Schauder theorem.


Theorem A.28 (Banach-Steinhaus, Principle of Uniform Boundedness)
Let $X$ be a Banach space, $Y$ be a normed space, $I$ be an index set, and
$A_\alpha \in \mathcal{L}(X,Y)$, $\alpha \in I$, be a collection of linear bounded operators such that

\[ \sup_{\alpha \in I} \|A_\alpha x\|_Y < \infty \quad \text{for every } x \in X. \]

Then $\sup_{\alpha \in I} \|A_\alpha\|_{\mathcal{L}(X,Y)} < \infty$.

As an immediate consequence, we have the following.

Theorem A.29 Let $X$ be a Banach space, $Y$ be a normed space, $D \subset X$ be a
dense subspace, and $A_n \in \mathcal{L}(X,Y)$ for $n \in \mathbb{N}$. Then the following two assertions
are equivalent:

(i) $A_n x \to 0$ as $n \to \infty$ for all $x \in X$.

(ii) $\sup_{n \in \mathbb{N}} \|A_n\|_{\mathcal{L}(X,Y)} < \infty$ and $A_n x \to 0$ as $n \to \infty$ for all $x \in D$.

We saw in Theorem A.10 that every normed space $X$ possesses a unique
completion $\tilde X$. Every linear bounded operator defined on $X$ can also be extended
to $\tilde X$.


Theorem A.30 Let $\tilde X, Y$ be Banach spaces, $X \subset \tilde X$ a dense subspace, and
$A : X \to Y$ be linear and bounded. Then there exists a linear bounded operator
$\tilde A : \tilde X \to Y$ with

(i) $\tilde A x = Ax$ for all $x \in X$; that is, $\tilde A$ is an extension of $A$, and

(ii) $\|\tilde A\|_{\mathcal{L}(\tilde X,Y)} = \|A\|_{\mathcal{L}(X,Y)}$.

Furthermore, the operator $\tilde A$ is uniquely determined.


We now study equations of the form

\[ x - Kx = y, \tag{A.18} \]

where the operator norm of the linear bounded operator $K : X \to X$ is small.
The following theorem plays an essential role in the study of Volterra integral
equations.


Theorem A.31 (Contraction Theorem, Neumann Series)
Let $X$ be a Banach space over $\mathbb{R}$ or $\mathbb{C}$ and $K : X \to X$ be a linear bounded
operator with

\[ \limsup_{n \to \infty} \|K^n\|^{1/n}_{\mathcal{L}(X)} < 1. \tag{A.19} \]

Then $I - K$ is invertible, the Neumann series $\sum_{n=0}^\infty K^n$ converges in the operator
norm, and

\[ \sum_{n=0}^\infty K^n = (I - K)^{-1}. \]

Condition (A.19) is satisfied if, for example, $\|K^m\|_{\mathcal{L}(X)} < 1$ for some $m \in \mathbb{N}$.
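For matrices, the Neumann series can be summed directly. The sketch below does this for two 2×2 examples: a matrix whose row-sum norm is 0.7, and a nilpotent matrix whose norm is 5 but whose square vanishes, so that the condition on a power of the operator holds with m = 2 although the operator itself is not small. Both matrices and the number of terms are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(K, terms):
    """Partial sum I + K + K^2 + ... + K^terms of the Neumann series."""
    n = len(K)
    power = identity(n)
    total = identity(n)
    for _ in range(terms):
        power = matmul(power, K)
        total = matadd(total, power)
    return total

def residual(K, S):
    """Largest entry of (I - K) S - I in absolute value."""
    n = len(K)
    I = identity(n)
    I_minus_K = [[I[i][j] - K[i][j] for j in range(n)] for i in range(n)]
    P = matmul(I_minus_K, S)
    return max(abs(P[i][j] - I[i][j]) for i in range(n) for j in range(n))

K_small = [[0.2, 0.5], [0.1, 0.3]]    # row-sum norm 0.7 < 1
K_nilpotent = [[0.0, 5.0], [0.0, 0.0]]  # norm 5, but K^2 = 0

res_small = residual(K_small, neumann_sum(K_small, 200))
res_nilpotent = residual(K_nilpotent, neumann_sum(K_nilpotent, 2))
print(res_small, res_nilpotent)
```

In both cases the partial sum inverts I - K up to rounding; for the nilpotent matrix the series terminates after the linear term.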


Example A.32
Let $\Delta := \{ (t,s) \in \mathbb{R}^2 : a < s < t < b \}$.

(a) Let $k \in L^2(\Delta)$. Then the Volterra operator

\[ (Kx)(t) := \int_a^t k(t,s)\, x(s)\, ds, \quad a < t < b,\ x \in L^2(a,b), \tag{A.20} \]

is bounded in $L^2(a,b)$. There exists $m \in \mathbb{N}$ with $\|K^m\|_{\mathcal{L}(L^2(a,b))} < 1$. The
Volterra equation of the second kind

\[ x(t) - \int_a^t k(t,s)\, x(s)\, ds = y(t), \quad a < t < b, \tag{A.21} \]

is uniquely solvable in $L^2(a,b)$ for every $y \in L^2(a,b)$, and the solution $x$
depends continuously on $y$. The solution $x \in L^2(a,b)$ has the form

\[ x(t) = y(t) + \int_a^t r(t,s)\, y(s)\, ds, \quad t \in (a,b), \]

with some kernel $r \in L^2(\Delta)$.

(b) Let $k \in C(\overline{\Delta})$. Then the operator $K$ defined by (A.20) is bounded in
$C[a,b]$, and there exists $m \in \mathbb{N}$ with $\|K^m\|_\infty < 1$. Equation (A.21) is
also uniquely solvable in $C[a,b]$ for every $y \in C[a,b]$, and the solution $x$
depends continuously on $y$.
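A simple instance of the Volterra equation of the second kind is obtained for the kernel k = 1 on the interval (0,1) and the right-hand side y = 1: the equation then reads x(t) minus the integral of x from 0 to t equals 1, and its solution is the exponential function. The sketch below approximates the solution by successive approximation (the partial sums of the Neumann series), with the integral replaced by the composite trapezoidal rule. The grid size, the number of iterations, and the function names are assumptions made for this illustration and not part of the text.

```python
import math

def volterra_apply(x, h):
    """Trapezoidal approximation of (Kx)(t_i) = int_0^{t_i} x(s) ds on a uniform grid."""
    out = [0.0]
    for i in range(1, len(x)):
        out.append(out[-1] + 0.5 * h * (x[i - 1] + x[i]))
    return out

n = 200                 # number of subintervals of [0, 1]
h = 1.0 / n
y = [1.0] * (n + 1)     # right-hand side y(t) = 1

# Successive approximation x_{m+1} = y + K x_m, starting from x_0 = y.
x = y[:]
for _ in range(40):
    Kx = volterra_apply(x, h)
    x = [yi + ki for yi, ki in zip(y, Kx)]

# Compare with the exact solution exp(t) on the grid.
error = max(abs(xi - math.exp(i * h)) for i, xi in enumerate(x))
print(x[-1], math.e, error)
```

The iteration converges quickly although the operator norm of K is not small, which mirrors the statement that some power of a Volterra operator has norm less than one. The remaining error is due to the trapezoidal rule and decreases quadratically with the mesh size.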


For the remaining part of this section, we assume that $X$ and $Y$ are normed
spaces and $K : X \to Y$ a linear and bounded operator.

Definition A.33 (Compact Operator)
The operator $K : X \to Y$ is called compact if it maps every bounded set $S$ into
a relatively compact set $K(S)$.

We recall that a set $M \subset Y$ is called relatively compact if every bounded
sequence $(y_j)_j$ in $M$ has an accumulation point in $\mathrm{closure}(M)$; that is, if the
closure $\mathrm{closure}(M)$ is compact. The set of all compact operators from $X$ into
$Y$ is a closed subspace of $\mathcal{L}(X,Y)$ and even a two-sided ideal by part (c) of the
following theorem.


Theorem A.34 (a) If $K_1$ and $K_2$ are compact from $X$ into $Y$, then so are
$K_1 + K_2$ and $\lambda K_1$ for every $\lambda \in \mathbb{K}$.

(b) Let $K_n : X \to Y$ be a sequence of compact operators between Banach
spaces $X$ and $Y$. Let $K : X \to Y$ be bounded, and let $K_n$ converge to $K$ in
the operator norm; that is,

\[ \|K_n - K\|_{\mathcal{L}(X,Y)} := \sup_{x \ne 0} \frac{\|K_n x - Kx\|_Y}{\|x\|_X} \longrightarrow 0 \quad (n \to \infty). \]

Then $K$ is also compact.

(c) If $L \in \mathcal{L}(X,Y)$ and $K \in \mathcal{L}(Y,Z)$, and $L$ or $K$ is compact, then $KL$ is also
compact.

(d) Let $A_n \in \mathcal{L}(X,Y)$ be pointwise convergent to some $A \in \mathcal{L}(X,Y)$; that is,
$A_n x \to Ax$ for all $x \in X$. If $K : Z \to X$ is compact, then $\|A_n K - AK\|_{\mathcal{L}(Z,Y)} \to 0$;
that is, the operators $A_n K$ converge to $AK$ in the operator norm.

(e) The identity operator $x \mapsto x$ is compact as an operator from $X$ into itself
if, and only if, $X$ is finite dimensional.

(f) Every bounded operator $K$ from $X$ into $Y$ with finite-dimensional range is
compact.


Theorem A.35 (a) Let $k \in L^2((c,d) \times (a,b))$. The operator $K : L^2(a,b) \to
L^2(c,d)$, defined by

\[ (Kx)(t) := \int_a^b k(t,s)\, x(s)\, ds, \quad t \in (c,d),\ x \in L^2(a,b), \tag{A.22} \]

is compact from $L^2(a,b)$ into $L^2(c,d)$.

(b) Let $k$ be continuous on $[c,d] \times [a,b]$ or weakly singular on $[a,b] \times [a,b]$ (in this
case $[c,d] = [a,b]$). Then $K$ defined by (A.22) is also compact as an operator
from $C[a,b]$ into $C[c,d]$.


We now study equations of the form

\[ x - Kx = y, \tag{A.23} \]

where the linear operator $K : X \to X$ is compact. The following theorem
extends the well-known existence results for finite linear systems of $n$ equations
and $n$ variables to compact perturbations of the identity.


Theorem A.36 (Riesz)
Let $X$ be a normed space and $K : X \to X$ be a linear compact operator.

(a) The null space $\mathcal{N}(I - K) = \{ x \in X : x = Kx \}$ is finite-dimensional and
the range $\mathcal{R}(I - K) = (I - K)(X)$ is closed in $X$.

(b) If $I - K$ is one-to-one, then $I - K$ is also surjective, and the inverse
$(I - K)^{-1}$ is bounded. In other words, if the homogeneous equation
$x - Kx = 0$ admits only the trivial solution $x = 0$, then the inhomogeneous
equation $x - Kx = y$ is uniquely solvable for every $y \in X$ and the solution
$x$ depends continuously on $y$.


The next theorem studies approximations of equations of the form $Ax = y$.
Again, we have in mind that $A = I - K$.

Theorem A.37 Assume that the operator $A : X \to Y$ between Banach spaces
$X$ and $Y$ has a bounded inverse $A^{-1}$. Let $A_n \in \mathcal{L}(X,Y)$ be a sequence of
bounded operators that converge in norm to $A$; that is, $\|A_n - A\|_{\mathcal{L}(X,Y)} \to 0$ as
$n \to \infty$. Then, for sufficiently large $n$, more precisely for all $n$ with

$$\|A^{-1}(A_n - A)\|_{\mathcal{L}(X)} < 1, \tag{A.24}$$

the inverse operators $A_n^{-1} : Y \to X$ exist and are uniformly bounded by

$$\|A_n^{-1}\|_{\mathcal{L}(Y,X)} \le \frac{\|A^{-1}\|_{\mathcal{L}(Y,X)}}{1 - \|A^{-1}(A_n - A)\|_{\mathcal{L}(X)}} \le c. \tag{A.25}$$

For the solutions of the equations

$$Ax = y \quad\text{and}\quad A_n x_n = y_n,$$

the error estimate

$$\|x_n - x\|_X \le c\,\bigl\{\|A_n x - A x\|_Y + \|y_n - y\|_Y\bigr\} \tag{A.26}$$

holds with the constant $c$ from (A.25).
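
The bound (A.25) can be checked numerically in finite dimensions. The following is a minimal sketch, assuming real matrices in place of the operators and the spectral norm in place of the operator norm; the matrices `A`, `E` and the helper `inverse_bound` are hypothetical and serve only as an illustration.

```python
import numpy as np

def inverse_bound(A, A_n):
    """Return (||A_n^{-1}||, right-hand side of (A.25)) in the spectral norm."""
    A_inv = np.linalg.inv(A)
    q = np.linalg.norm(A_inv @ (A_n - A), 2)      # left-hand side of (A.24)
    if q >= 1.0:
        raise ValueError("condition (A.24) is violated")
    bound = np.linalg.norm(A_inv, 2) / (1.0 - q)
    return np.linalg.norm(np.linalg.inv(A_n), 2), bound

rng = np.random.default_rng(0)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))  # hypothetical invertible operator
E = rng.standard_normal((4, 4))                    # hypothetical perturbation direction
results = [inverse_bound(A, A + E / n) for n in (50, 100, 200)]
for actual, bound in results:
    print(f"||A_n^-1|| = {actual:.4f} <= {bound:.4f}")
```

As $n$ grows, the quantity in (A.24) tends to zero and the bound approaches $\|A^{-1}\|$.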


A.4 Sobolev Spaces of Periodic Functions 


In this section, we recall definitions and properties of Sobolev (Hilbert) spaces 
of periodic functions. A complete discussion including proofs can be found in 
the monograph [168]. 
From Parseval's identity (A.11), we note that $x \in L^2(0,2\pi)$ if and only if the
Fourier coefficients

$$a_k = \frac{1}{2\pi} \int_0^{2\pi} x(s)\, e^{-iks}\, ds, \quad k \in \mathbb{Z}, \tag{A.27}$$

are square summable. In this case

$$\sum_{k \in \mathbb{Z}} |a_k|^2 = \frac{1}{2\pi}\, \|x\|_{L^2}^2.$$

If $x$ is periodic and continuously differentiable on $[0,2\pi]$, partial integration of
(A.27) yields the formula

$$a_k = \frac{1}{ik}\, \frac{1}{2\pi} \int_0^{2\pi} x'(s)\, e^{-iks}\, ds, \quad k \neq 0;$$

that is, $ik\,a_k$ are the Fourier coefficients of $x'$ and are thus square summable.
This motivates the introduction of subspaces $H^r_{\mathrm{per}}(0,2\pi)$ of $L^2(0,2\pi)$ by requiring
for their elements a certain decay of the Fourier coefficients $a_k$. In the
following we set

$$\psi_k(t) = e^{ikt} \quad\text{for } k \in \mathbb{Z} \text{ and } t \in (0,2\pi).$$

Definition A.38 (Sobolev space of periodic functions)
For $r \ge 0$, the Sobolev space $H^r_{\mathrm{per}}(0,2\pi)$ of order $r$ is defined by

$$H^r_{\mathrm{per}}(0,2\pi) := \Bigl\{ x = \sum_{k \in \mathbb{Z}} a_k \psi_k \in L^2(0,2\pi) : \sum_{k \in \mathbb{Z}} (1 + k^2)^r |a_k|^2 < \infty \Bigr\}.$$

We note that $H^0_{\mathrm{per}}(0,2\pi)$ coincides with $L^2(0,2\pi)$.


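Theorem A.39 The Sobolev space $H^r_{\mathrm{per}}(0,2\pi)$ is a Hilbert space with the inner
product defined by

$$(x,y)_{H^r_{\mathrm{per}}} := \sum_{k \in \mathbb{Z}} (1 + k^2)^r\, a_k\, \overline{b_k}, \tag{A.28}$$

where $x = \sum_{k \in \mathbb{Z}} a_k \psi_k$ and $y = \sum_{k \in \mathbb{Z}} b_k \psi_k$. The norm in $H^r_{\mathrm{per}}(0,2\pi)$ is given
by

$$\|x\|_{H^r_{\mathrm{per}}} = \Bigl( \sum_{k \in \mathbb{Z}} (1 + k^2)^r |a_k|^2 \Bigr)^{1/2}.$$

The Sobolev space $H^r_{\mathrm{per}}(0,2\pi)$ is dense in $L^2(0,2\pi)$.


We note that $\|x\|_{L^2} = \sqrt{2\pi}\, \|x\|_{H^0_{\mathrm{per}}}$, that is, the norms $\|x\|_{L^2}$ and $\|x\|_{H^0_{\mathrm{per}}}$
are equivalent on $L^2(0,2\pi)$.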
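
The norm of Theorem A.39 is easy to evaluate for trigonometric polynomials. The following is a minimal sketch, assuming that the Fourier coefficients (A.27) are approximated by the discrete Fourier transform of equidistant samples (exact for trigonometric polynomials of sufficiently low degree); the function names and the test function `x` are hypothetical.

```python
import numpy as np

def fourier_coefficients(samples):
    """Approximate a_k of (A.27) from samples x(2*pi*j/N), j = 0..N-1."""
    N = len(samples)
    a = np.fft.fft(samples) / N            # entry k holds a_k (negative k wrap around)
    k = np.fft.fftfreq(N, d=1.0 / N)       # matching integer frequencies
    return k, a

def sobolev_norm(samples, r):
    """Norm of H^r_per(0, 2*pi) computed from the Fourier coefficients."""
    k, a = fourier_coefficients(samples)
    return np.sqrt(np.sum((1.0 + k**2) ** r * np.abs(a) ** 2))

N = 64
t = 2 * np.pi * np.arange(N) / N
x = np.exp(1j * t) + 0.5 * np.exp(-3j * t)           # a_1 = 1, a_{-3} = 1/2
l2_norm = np.sqrt(2 * np.pi / N * np.sum(np.abs(x) ** 2))
print(sobolev_norm(x, 1.0), l2_norm / np.sqrt(2 * np.pi))
```

For this `x` the squared $H^1_{\mathrm{per}}$ norm is $2 \cdot 1 + 10 \cdot \tfrac14 = 4.5$, and the second printed value equals the $H^0_{\mathrm{per}}$ norm.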


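Theorem A.40 (a) For $r \in \mathbb{N}_0 := \mathbb{N} \cup \{0\}$, the space $C^r_{\mathrm{per}}[0,2\pi] = \{x \in C^r[0,2\pi] : x \text{ is } 2\pi\text{-periodic}\}$ is boundedly embedded in
$H^r_{\mathrm{per}}(0,2\pi)$.

(b) The space $\mathcal{T}$ of all trigonometric polynomials

$$\mathcal{T} := \Bigl\{ \sum_{k=-n}^{n} a_k \psi_k : a_k \in \mathbb{C},\ n \in \mathbb{N} \Bigr\}$$

is dense in $H^r_{\mathrm{per}}(0,2\pi)$ for every $r \ge 0$.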


We consider $H^r_{\mathrm{per}}(0,2\pi) \subset L^2(0,2\pi)$ as a Banach space (that is, forget
the inner product for a moment) with bounded imbedding $j : H^r_{\mathrm{per}}(0,2\pi) \to L^2(0,2\pi)$
which has a dense range. Therefore, we can consider the corresponding
Gelfand triple $H^r_{\mathrm{per}}(0,2\pi) \subset L^2(0,2\pi) \subset H^r_{\mathrm{per}}(0,2\pi)'$ where $H^r_{\mathrm{per}}(0,2\pi)'$
denotes the space of all anti-linear functionals on $H^r_{\mathrm{per}}(0,2\pi)$, see Definition A.26.
We make the following definition.


Definition A.41 For $r \ge 0$, we denote by $H^{-r}_{\mathrm{per}}(0,2\pi) = H^r_{\mathrm{per}}(0,2\pi)'$ the anti-dual
space of $H^r_{\mathrm{per}}(0,2\pi)$; that is, the space of all anti-linear bounded functionals
on $H^r_{\mathrm{per}}(0,2\pi)$. Then $H^r_{\mathrm{per}}(0,2\pi) \subset L^2(0,2\pi) \subset H^{-r}_{\mathrm{per}}(0,2\pi)$ with bounded and
dense imbeddings. The corresponding sesquilinear form $\langle \cdot, \cdot \rangle : H^{-r}_{\mathrm{per}}(0,2\pi) \times H^r_{\mathrm{per}}(0,2\pi) \to \mathbb{C}$
extends the inner product in $L^2(0,2\pi)$; that is,

$$\langle \psi, \phi \rangle = (\psi, \phi)_{L^2} = \int_0^{2\pi} \psi(t)\, \overline{\phi(t)}\, dt$$

for all $\psi \in L^2(0,2\pi)$ and $\phi \in H^r_{\mathrm{per}}(0,2\pi)$.

The following theorems give characterizations in terms of the Fourier coefficients.


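Theorem A.42 Let again $\psi_k(t) = e^{ikt}$ for $k \in \mathbb{Z}$ and $t \in (0,2\pi)$.


(a) Let $\ell \in H^{-r}_{\mathrm{per}}(0,2\pi) = H^r_{\mathrm{per}}(0,2\pi)'$ and define $c_k := \langle \ell, \psi_k \rangle$ for $k \in \mathbb{Z}$.
Then

$$\|\ell\|_{H^{-r}_{\mathrm{per}}} = \Bigl( \sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2 \Bigr)^{1/2}$$

and

$$\langle \ell, x \rangle = \sum_{k \in \mathbb{Z}} c_k\, \overline{a_k} \quad\text{for all } x = \sum_{k \in \mathbb{Z}} a_k \psi_k \in H^r_{\mathrm{per}}(0,2\pi). \tag{A.29}$$

(b) Conversely, let $c_k \in \mathbb{C}$ satisfy

$$\sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2 < \infty.$$

Then $\ell$, defined by (A.29), is in $H^{-r}_{\mathrm{per}}(0,2\pi)$ with $\sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2 = \|\ell\|^2_{H^{-r}_{\mathrm{per}}}$.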


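Proof: (a) Set $z^N = \sum_{|k| \le N} c_k (1 + k^2)^{-r} \psi_k$ for $N \in \mathbb{N}$. Then (note that $\ell$
is anti-linear)

$$\begin{aligned}
\sum_{|k| \le N} (1 + k^2)^{-r} |c_k|^2 = \langle \ell, z^N \rangle &\le \|\ell\|_{H^{-r}_{\mathrm{per}}}\, \|z^N\|_{H^r_{\mathrm{per}}} \\
&= \|\ell\|_{H^{-r}_{\mathrm{per}}} \sqrt{\sum_{|k| \le N} (1 + k^2)^{r} |c_k|^2 (1 + k^2)^{-2r}} \\
&= \|\ell\|_{H^{-r}_{\mathrm{per}}} \sqrt{\sum_{|k| \le N} (1 + k^2)^{-r} |c_k|^2},
\end{aligned}$$

which proves $\sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2 \le \|\ell\|^2_{H^{-r}_{\mathrm{per}}}$ by letting $N$ tend to infinity. In
particular, the series converges. Furthermore, for $x = \sum_{k \in \mathbb{Z}} a_k \psi_k \in H^r_{\mathrm{per}}(0,2\pi)$
we set $x^N = \sum_{|k| \le N} a_k \psi_k$ and have that $\langle \ell, x^N \rangle = \sum_{|k| \le N} \overline{a_k}\, c_k$ and thus

$$\begin{aligned}
|\langle \ell, x^N \rangle| = \Bigl| \sum_{|k| \le N} \overline{a_k}\, c_k \Bigr| &\le \sum_{|k| \le N} |a_k| (1 + k^2)^{r/2}\, |c_k| (1 + k^2)^{-r/2} \\
&\le \sqrt{\sum_{|k| \le N} |a_k|^2 (1 + k^2)^{r}}\ \sqrt{\sum_{|k| \le N} |c_k|^2 (1 + k^2)^{-r}} \\
&= \|x^N\|_{H^r_{\mathrm{per}}} \sqrt{\sum_{|k| \le N} |c_k|^2 (1 + k^2)^{-r}},
\end{aligned}$$

and thus also $\langle \ell, x \rangle = \sum_{k \in \mathbb{Z}} \overline{a_k}\, c_k$ and $\|\ell\|^2_{H^{-r}_{\mathrm{per}}} \le \sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2$ by letting
$N$ tend to infinity.


(b) This is shown in the same way.
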

This theorem makes it possible to equip the Banach space $H^{-r}_{\mathrm{per}}(0,2\pi) = H^r_{\mathrm{per}}(0,2\pi)'$
with an inner product and make it a Hilbert space.


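Theorem A.43 Let $r > 0$. On $L^2(0,2\pi)$, we define the inner product and norm
by

$$(x,y)_{-r} := \sum_{k \in \mathbb{Z}} (1 + k^2)^{-r}\, a_k\, \overline{b_k}, \tag{A.30a}$$

$$\|x\|_{-r} := \sqrt{\sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |a_k|^2}, \tag{A.30b}$$

where $x = \sum_{k \in \mathbb{Z}} a_k \psi_k$ and $y = \sum_{k \in \mathbb{Z}} b_k \psi_k$. Then the completion
$\hat{H}^{-r}_{\mathrm{per}}(0,2\pi)$ of $L^2(0,2\pi)$ with respect to $\|\cdot\|_{-r}$ can be identified with $H^{-r}_{\mathrm{per}}(0,2\pi)$.
The isomorphism is given by the extension of $J : L^2(0,2\pi) \to H^{-r}_{\mathrm{per}}(0,2\pi)$, where

$$\langle Jx, y \rangle := \sum_{k \in \mathbb{Z}} a_k\, \overline{b_k} \quad\text{for } x = \sum_{k \in \mathbb{Z}} a_k \psi_k \in L^2(0,2\pi)$$

and $y = \sum_{k \in \mathbb{Z}} b_k \psi_k \in H^r_{\mathrm{per}}(0,2\pi)$. Therefore, we identify $\|p\|_{-r}$ with $\|Jp\|_{H^{-r}_{\mathrm{per}}}$
and simply write $\|p\|_{H^{-r}_{\mathrm{per}}}$.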

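Proof: First we show that $Jx \in H^{-r}_{\mathrm{per}}(0,2\pi)$. Indeed, by the Cauchy-Schwarz
inequality, we have for $y = \sum_{k \in \mathbb{Z}} b_k \psi_k \in H^r_{\mathrm{per}}(0,2\pi)$

$$\begin{aligned}
|\langle Jx, y \rangle| &\le \sum_{k \in \mathbb{Z}} \bigl[ (1 + k^2)^{-r/2} |a_k| \bigr] \bigl[ (1 + k^2)^{r/2} |b_k| \bigr] \\
&\le \Bigl( \sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |a_k|^2 \Bigr)^{1/2} \Bigl( \sum_{k \in \mathbb{Z}} (1 + k^2)^{r} |b_k|^2 \Bigr)^{1/2} \\
&= \|x\|_{-r}\, \|y\|_{H^r_{\mathrm{per}}},
\end{aligned}$$

and thus $Jx \in H^{-r}_{\mathrm{per}}(0,2\pi)$ with $\|Jx\|_{H^{-r}_{\mathrm{per}}} \le \|x\|_{-r}$ for all $x \in L^2(0,2\pi)$. By the
previous theorem, applied to $\ell = Jx$, we have $\|Jx\|^2_{H^{-r}_{\mathrm{per}}} = \sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2$
with $c_k = \langle Jx, \psi_k \rangle = a_k$. Therefore, $\|Jx\|_{H^{-r}_{\mathrm{per}}} = \|x\|_{-r}$, and $J$ can be extended
to a bounded operator from $\hat{H}^{-r}_{\mathrm{per}}(0,2\pi)$ into $H^{-r}_{\mathrm{per}}(0,2\pi)$ (by Theorem A.30).
It remains to show that $J$ is surjective. Let $\ell \in H^{-r}_{\mathrm{per}}(0,2\pi) = H^r_{\mathrm{per}}(0,2\pi)'$ and
define $c_k = \langle \ell, \psi_k \rangle$ and $x^N = \sum_{|k| \le N} c_k \psi_k$ for $N \in \mathbb{N}$. For $y = \sum_{k \in \mathbb{Z}} b_k \psi_k \in H^r_{\mathrm{per}}(0,2\pi)$
we have $\langle Jx^N, y \rangle = \sum_{|k| \le N} \overline{b_k}\, c_k = \bigl\langle \ell, \sum_{|k| \le N} b_k \psi_k \bigr\rangle$, which converges
to $\langle \ell, y \rangle$. Furthermore, $(x^N)$ is a Cauchy sequence with respect to $\|\cdot\|_{-r}$ because
of the convergence of $\sum_{k \in \mathbb{Z}} (1 + k^2)^{-r} |c_k|^2$ by the previous theorem, part (a).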


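Theorem A.44 (a) For $r > s$, the Sobolev space $H^r_{\mathrm{per}}(0,2\pi)$ is a dense subspace
of $H^s_{\mathrm{per}}(0,2\pi)$. The inclusion operator from $H^r_{\mathrm{per}}(0,2\pi)$ into $H^s_{\mathrm{per}}(0,2\pi)$ is
compact.


(b) For all $r \ge 0$ and $x \in L^2(0,2\pi)$ and $y \in H^r_{\mathrm{per}}(0,2\pi)$, there holds

$$\bigl| (x,y)_{L^2} \bigr| = 2\pi\, \bigl| (x,y)_{H^0_{\mathrm{per}}} \bigr| \le 2\pi\, \|x\|_{H^{-r}_{\mathrm{per}}}\, \|y\|_{H^r_{\mathrm{per}}}. \tag{A.31}$$

We note that the estimate (A.31) is in accordance with (A.16) because, with
the imbedding $j'$ from $L^2(0,2\pi)$ into $H^{-r}_{\mathrm{per}}(0,2\pi)$ and the identification $J$ from
$\hat{H}^{-r}_{\mathrm{per}}(0,2\pi)$ onto $H^{-r}_{\mathrm{per}}(0,2\pi)$ of the previous theorem, we show easily that the
imbedding of $L^2(0,2\pi)$ into $\hat{H}^{-r}_{\mathrm{per}}(0,2\pi)$ is given by $J^{-1} \circ j'$ where $(J^{-1} \circ j')x = 2\pi x$
for $x \in L^2(0,2\pi)$.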


Proof: (a) Denseness is easily seen by truncating the Fourier series.
Compactness is shown by defining the finite-dimensional operators $J_N$ from
$H^r_{\mathrm{per}}(0,2\pi)$ into $H^s_{\mathrm{per}}(0,2\pi)$ by $J_N x = \sum_{|k| \le N} a_k \psi_k$ where $x = \sum_{k \in \mathbb{Z}} a_k \psi_k$.
Then $J_N$ is compact by part (f) of Theorem A.34 and

$$\|J_N x - x\|^2_{H^s_{\mathrm{per}}} = \sum_{|k| > N} (1 + k^2)^{s} |a_k|^2 \le \frac{1}{(1 + N^2)^{r-s}} \sum_{|k| > N} (1 + k^2)^{r} |a_k|^2 \le \frac{1}{N^{2(r-s)}}\, \|x\|^2_{H^r_{\mathrm{per}}}.$$

Therefore, $J_N$ converges in the operator norm to the imbedding $J$ which implies,
again by Theorem A.34, that also $J$ is compact.


(b) This is seen as in the proof of Theorem A.43 with the Cauchy-Schwarz
inequality.


Theorems A.40 and A.43 imply that the space $\mathcal{T}$ of all trigonometric polynomials
is dense in $H^r_{\mathrm{per}}(0,2\pi)$ for every $r \in \mathbb{R}$. Now we study the orthogonal
projection and the interpolation operators with respect to equidistant knots and
the $2n$-dimensional space

$$\mathcal{T}_n := \Bigl\{ \sum_{k=-n}^{n-1} a_k \psi_k : a_k \in \mathbb{C} \Bigr\}, \tag{A.32}$$

where again $\psi_k(t) = e^{ikt}$ for $t \in [0,2\pi]$ and $k \in \mathbb{Z}$.

Lemma A.45 Let $r, s \in \mathbb{R}$ with $r \ge s$.


(a) The following stability estimate holds:

$$\|z_n\|_{H^r_{\mathrm{per}}} \le c\, n^{r-s}\, \|z_n\|_{H^s_{\mathrm{per}}} \quad\text{for all } z_n \in \mathcal{T}_n.$$

(b) Let $P_n : L^2(0,2\pi) \to \mathcal{T}_n \subset L^2(0,2\pi)$ be the orthogonal projection operator.
Then $P_n$ is given by

$$P_n x = \sum_{k=-n}^{n-1} a_k \psi_k, \quad x \in L^2(0,2\pi), \tag{A.33}$$

where

$$a_k = \frac{1}{2\pi} \int_0^{2\pi} x(s)\, e^{-iks}\, ds, \quad k \in \mathbb{Z},$$

are the Fourier coefficients of $x$. Furthermore, the following estimate
holds:

$$\|x - P_n x\|_{H^s_{\mathrm{per}}} \le \frac{1}{n^{r-s}}\, \|x\|_{H^r_{\mathrm{per}}} \quad\text{for all } x \in H^r_{\mathrm{per}}(0,2\pi). \tag{A.34}$$


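Proof: (a) With $z_n = \sum_{k=-n}^{n-1} a_k \psi_k$ this follows simply by

$$\|z_n\|^2_{H^r_{\mathrm{per}}} = \sum_{k=-n}^{n-1} (1 + k^2)^{r-s} (1 + k^2)^{s} |a_k|^2 \le (1 + n^2)^{r-s}\, \|z_n\|^2_{H^s_{\mathrm{per}}} \le (2n^2)^{r-s}\, \|z_n\|^2_{H^s_{\mathrm{per}}}.$$

(b) Let $x = \sum_{k \in \mathbb{Z}} a_k \psi_k \in L^2(0,2\pi)$ and define the right-hand side of (A.33)
by $z$; that is, $z = \sum_{k=-n}^{n-1} a_k \psi_k \in \mathcal{T}_n$. The orthogonality of the $\psi_k$ implies that
$x - z$ is orthogonal to $\mathcal{T}_n$. This proves that $z$ coincides with $P_n x$. Now let
$x \in H^r_{\mathrm{per}}(0,2\pi)$. Then

$$\begin{aligned}
\|x - P_n x\|^2_{H^s_{\mathrm{per}}} &\le \sum_{|k| \ge n} (1 + k^2)^{s} |a_k|^2 = \sum_{|k| \ge n} \bigl[ (1 + k^2)^{-(r-s)} (1 + k^2)^{r} |a_k|^2 \bigr] \\
&\le (1 + n^2)^{s-r}\, \|x\|^2_{H^r_{\mathrm{per}}} \le n^{2(s-r)}\, \|x\|^2_{H^r_{\mathrm{per}}}.
\end{aligned}$$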
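
The projection estimate (A.34) can be observed directly on a coefficient sequence. The following is a minimal sketch, assuming a finitely supported, hypothetical coefficient sequence `a` and working with the Fourier coefficients instead of the functions; the helper names are not part of the text.

```python
import numpy as np

def hs_norm(k, a, s):
    """Norm of H^s_per for the coefficient sequence a indexed by the integers k."""
    return np.sqrt(np.sum((1.0 + k.astype(float) ** 2) ** s * np.abs(a) ** 2))

def projection_error(k, a, n, s):
    """H^s norm of x - P_n x, where P_n keeps the modes -n <= k <= n-1, see (A.33)."""
    outside = (k < -n) | (k > n - 1)
    return hs_norm(k[outside], a[outside], s)

k = np.arange(-200, 201)
a = 1.0 / (1.0 + np.abs(k)) ** 3            # hypothetical Fourier coefficients
r, s = 2.0, 0.5
checks = [(projection_error(k, a, n, s), n ** (s - r) * hs_norm(k, a, r))
          for n in (4, 16, 64)]
for err, bound in checks:
    print(f"error {err:.3e} <= bound {bound:.3e}")
```

The left-hand values decay at least like $n^{-(r-s)}$, in agreement with the lemma.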


Now let $t_j := j\,\frac{\pi}{n}$, $j = 0, \ldots, 2n-1$, be equidistantly chosen points in $[0,2\pi]$.
Interpolation of smooth periodic functions by trigonometric polynomials can be
found in numerous books as, for example, in [72]. Interpolation in Sobolev
spaces of integer orders can be found in [42]. We give a different and much
simpler proof of the error estimates that are optimal and hold in Sobolev spaces
of fractional order.


Theorem A.46 For every $n \in \mathbb{N}$ and every $2\pi$-periodic function $x \in C[0,2\pi]$,
there exists a unique $p_n \in \mathcal{T}_n$ with $x(t_j) = p_n(t_j)$ for all $j = 0, \ldots, 2n-1$. The
trigonometric interpolation operator $Q_n : C_{\mathrm{per}}[0,2\pi] = \{x \in C[0,2\pi] : x(0) = x(2\pi)\} \to \mathcal{T}_n$
has the form

$$Q_n x = \sum_{k=0}^{2n-1} x(t_k)\, L_k$$

with Lagrange interpolation basis functions

$$L_k(t) = \frac{1}{2n} \sum_{m=-n}^{n-1} e^{im(t - t_k)}, \quad k = 0, \ldots, 2n-1. \tag{A.35}$$

The interpolation operator $Q_n$ has an extension to a bounded operator from
$H^r_{\mathrm{per}}(0,2\pi)$ into $\mathcal{T}_n \subset H^r_{\mathrm{per}}(0,2\pi)$ for all $r > \frac12$. Furthermore, $Q_n$ obeys estimates
of the form

$$\|x - Q_n x\|_{H^s_{\mathrm{per}}} \le \frac{c}{n^{r-s}}\, \|x\|_{H^r_{\mathrm{per}}} \quad\text{for all } x \in H^r_{\mathrm{per}}(0,2\pi), \tag{A.36}$$

where $0 \le s \le r$ and $r > \frac12$. The constant $c$ depends only on $s$ and $r$. In
particular, $\|Q_n\|_{\mathcal{L}(H^r_{\mathrm{per}}(0,2\pi))}$ is uniformly bounded with respect to $n$.

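Proof: The proof of the first part can be found in, for example, [168]. Let
$x(t) = \sum_{m \in \mathbb{Z}} a_m \exp(imt)$. Direct calculation shows that for smooth functions
$x$ the interpolation is given by

$$(Q_n x)(t) = \sum_{j=-n}^{n-1} \hat{a}_j\, e^{ijt} \quad\text{with}\quad \hat{a}_j = \frac{1}{2n} \sum_{k=0}^{2n-1} x(t_k)\, e^{-ijk\pi/n}, \quad j = -n, \ldots, n-1.$$

The connection between the continuous and discrete Fourier coefficients is simply

$$\hat{a}_j = \frac{1}{2n} \sum_{k=0}^{2n-1} \sum_{m \in \mathbb{Z}} a_m\, e^{imk\pi/n - ijk\pi/n} = \frac{1}{2n} \sum_{m \in \mathbb{Z}} a_m \sum_{k=0}^{2n-1} \bigl[ e^{i(m-j)\pi/n} \bigr]^k = \sum_{\ell \in \mathbb{Z}} a_{j + 2n\ell},$$

where $x = \sum_{m \in \mathbb{Z}} a_m \psi_m$. It is sufficient to estimate $P_n x - Q_n x$ because the
required estimate holds for $x - P_n x$ by formula (A.34). We have

$$P_n x - Q_n x = \sum_{m=-n}^{n-1} \bigl[ a_m - \hat{a}_m \bigr] \psi_m$$

and thus by the Cauchy-Schwarz inequality

$$\begin{aligned}
\|P_n x - Q_n x\|^2_{H^s_{\mathrm{per}}} &= \sum_{m=-n}^{n-1} |a_m - \hat{a}_m|^2 (1 + m^2)^s \le c\, n^{2s} \sum_{m=-n}^{n-1} \Bigl| \sum_{\ell \neq 0} a_{m + 2n\ell} \Bigr|^2 \\
&= c\, n^{2s} \sum_{m=-n}^{n-1} \Bigl| \sum_{\ell \neq 0} \bigl( 1 + (m + 2n\ell)^2 \bigr)^{r/2} a_{m + 2n\ell}\, \frac{1}{\bigl( 1 + (m + 2n\ell)^2 \bigr)^{r/2}} \Bigr|^2 \\
&\le c\, n^{2s} \sum_{m=-n}^{n-1} \Bigl[ \sum_{\ell \neq 0} \bigl( 1 + (m + 2n\ell)^2 \bigr)^{r} |a_{m + 2n\ell}|^2 \sum_{\ell \neq 0} \frac{1}{\bigl( 1 + (m + 2n\ell)^2 \bigr)^{r}} \Bigr].
\end{aligned}$$

From the obvious estimate

$$\sum_{\ell \neq 0} \bigl( 1 + (m + 2n\ell)^2 \bigr)^{-r} \le (2n)^{-2r} \sum_{\ell \neq 0} \Bigl| \frac{m}{2n} + \ell \Bigr|^{-2r} \le c\, n^{-2r}$$

for all $|m| \le n$ and $n \in \mathbb{N}$, we conclude that

$$\|P_n x - Q_n x\|^2_{H^s_{\mathrm{per}}} \le c\, n^{2(s-r)} \sum_{m=-n}^{n-1} \sum_{\ell \neq 0} \bigl( 1 + (m + 2n\ell)^2 \bigr)^{r} |a_{m + 2n\ell}|^2 \le c\, n^{2(s-r)}\, \|x\|^2_{H^r_{\mathrm{per}}}.$$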
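
The relation between the discrete and the continuous Fourier coefficients used in the proof, namely that each discrete coefficient collects all $a_m$ with $m \equiv j \pmod{2n}$, can be reproduced with the fast Fourier transform. The following is a minimal sketch, assuming the knots $t_k = k\pi/n$ and a hypothetical trigonometric polynomial given by the dictionary `a`; the function name is an illustration only.

```python
import numpy as np

def discrete_coefficients(x, n):
    """Discrete coefficients (1/2n) sum_k x(t_k) exp(-i j k pi / n) for j = -n..n-1."""
    t = np.pi * np.arange(2 * n) / n            # knots t_k = k * pi / n
    a_hat = np.fft.fft(x(t)) / (2 * n)          # ordered j = 0..n-1, -n..-1
    return np.fft.fftshift(a_hat)               # reordered to j = -n..n-1

n = 4
a = {1: 1.0, 1 + 2 * n: 0.25, 1 - 2 * n: -0.5, -3: 2.0}   # hypothetical coefficients a_m
x = lambda t: sum(c * np.exp(1j * m * t) for m, c in a.items())
a_hat = discrete_coefficients(x, n)
print(a_hat[1 + n], a_hat[-3 + n])              # entries for j = 1 and j = -3
```

The modes $m = 1,\ 1 + 2n,\ 1 - 2n$ are indistinguishable on the $2n$ knots, so their coefficients add up in the entry for $j = 1$.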


For real-valued functions, it is more convenient to study the orthogonal
projection and interpolation in the $2n$-dimensional space

$$\Bigl\{ \sum_{j=0}^{n} a_j \cos(jt) + \sum_{j=1}^{n-1} b_j \sin(jt) : a_j, b_j \in \mathbb{R} \Bigr\}.$$

In this case, the Lagrange interpolation basis functions $L_k$ are given by (see
[168])

$$L_k(t) = \frac{1}{2n} \Bigl\{ 1 + 2 \sum_{m=1}^{n-1} \cos m(t - t_k) + \cos n(t - t_k) \Bigr\}, \tag{A.37}$$

$k = 0, \ldots, 2n-1$, and the estimates (A.34) and (A.36) are proven by the same
arguments.


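Theorem A.47 Let $r \in \mathbb{N}$ and $k \in C^r([0,2\pi] \times [0,2\pi])$ be $2\pi$-periodic with
respect to both variables. Then the integral operator $K$, defined by

$$(Kx)(t) := \int_0^{2\pi} k(t,s)\, x(s)\, ds, \quad t \in (0,2\pi), \tag{A.38}$$

can be extended to a bounded operator from $H^p_{\mathrm{per}}(0,2\pi)$ into $H^r_{\mathrm{per}}(0,2\pi)$ for
every $-r \le p \le r$.

Proof: Let $x \in L^2(0,2\pi)$. From

$$\frac{d^j}{dt^j} (Kx)(t) = \int_0^{2\pi} \frac{\partial^j k(t,s)}{\partial t^j}\, x(s)\, ds, \quad j = 0, \ldots, r,$$

we conclude from Theorem A.44 that for $x \in L^2(0,2\pi)$

$$\Bigl| \frac{d^j}{dt^j} (Kx)(t) \Bigr| \le 2\pi\, \Bigl\| \frac{\partial^j k(t,\cdot)}{\partial t^j} \Bigr\|_{H^r_{\mathrm{per}}} \|x\|_{H^{-r}_{\mathrm{per}}}$$

and thus

$$\|Kx\|_{H^r_{\mathrm{per}}} \le c_1\, \|Kx\|_{C^r} \le c_2\, \|x\|_{H^{-r}_{\mathrm{per}}}$$

for all $x \in L^2(0,2\pi)$. Application of Theorem A.30 yields the assertion because
$L^2(0,2\pi)$ is dense in $H^{-r}_{\mathrm{per}}(0,2\pi)$.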


A.5 Sobolev Spaces on the Unit Disc 


Let $B = \{x \in \mathbb{R}^2 : |x| < 1\}$ be the open unit disc with boundary $\partial B$. In
this section, we consider functions from $B$ into $\mathbb{C}$ that we describe by Cartesian
coordinates $x = (x_1, x_2)$ or by polar coordinates $(r, \varphi)$. Functions on the
boundary are identified with $2\pi$-periodic functions on $\mathbb{R}$. As in the case of the
Sobolev spaces $H^r_{\mathrm{per}}(0,2\pi)$ we define the Sobolev space $H^1(B)$ by completion.


Definition A.48 The Sobolev space $H^1(B)$ is defined as the completion of
$C^\infty(\overline{B})$ with respect to the norm

$$\|f\|_{H^1(B)} = \sqrt{\int_B \bigl[ |f(x)|^2 + |\nabla f(x)|^2 \bigr]\, dx}. \tag{A.39}$$


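We express the norm (A.39) in polar coordinates $(r, \varphi)$. The gradient is
given in polar coordinates as

$$\nabla f(r,\varphi) = \frac{\partial f(r,\varphi)}{\partial r}\, \hat{r} + \frac{1}{r}\, \frac{\partial f(r,\varphi)}{\partial \varphi}\, \hat{\varphi},$$

where $\hat{r} = \binom{\cos\varphi}{\sin\varphi}$ and $\hat{\varphi} = \binom{-\sin\varphi}{\cos\varphi}$ denote the unit vectors. We fix $r > 0$ and
expand the function $f(r,\cdot)$ (formally) into a Fourier series with respect to $\varphi$:

$$f(r,\varphi) = \sum_{m \in \mathbb{Z}} f_m(r)\, e^{im\varphi}$$

with Fourier coefficients

$$f_m(r) = \frac{1}{2\pi} \int_0^{2\pi} f(r,t)\, e^{-imt}\, dt, \quad m \in \mathbb{Z},$$

that depend on $r$. Therefore,

$$\frac{\partial f}{\partial r}(r,\varphi) = \sum_{m \in \mathbb{Z}} f_m'(r)\, e^{im\varphi}, \qquad \frac{\partial f}{\partial \varphi}(r,\varphi) = i \sum_{m \in \mathbb{Z}} m\, f_m(r)\, e^{im\varphi}.$$

The norm in $H^1(B)$ is given by

$$\|f\|^2_{H^1(B)} = 2\pi \sum_{m \in \mathbb{Z}} \int_0^1 \Bigl[ \Bigl( 1 + \frac{m^2}{r^2} \Bigr) |f_m(r)|^2 + |f_m'(r)|^2 \Bigr]\, r\, dr, \tag{A.40}$$

because

$$|\nabla f(r,\varphi)|^2 = \Bigl| \frac{\partial f(r,\varphi)}{\partial r} \Bigr|^2 + \frac{1}{r^2} \Bigl| \frac{\partial f(r,\varphi)}{\partial \varphi} \Bigr|^2$$

and

$$\int_0^{2\pi} \Bigl| \sum_{m \in \mathbb{Z}} f_m(r)\, e^{im\varphi} \Bigr|^2 d\varphi = 2\pi \sum_{m \in \mathbb{Z}} |f_m(r)|^2.$$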

To every function $f \in C^\infty(\overline{B})$, one can assign the trace $f|_{\partial B}$ on $\partial B$. We denote
this mapping by $\tau$, thus $\tau : C^\infty(\overline{B}) \to C^\infty(\partial B)$ is defined as $\tau f = f|_{\partial B}$. The
following result is central.

Theorem A.49 (Trace Theorem)

The trace operator $\tau$ has an extension to a bounded operator from $H^1(B)$ to
$H^{1/2}(\partial B)$, where again $H^{1/2}(\partial B)$ is identified with the Sobolev space $H^{1/2}_{\mathrm{per}}(0,2\pi)$
of periodic functions (see Definition A.38). Furthermore, $\tau : H^1(B) \to H^{1/2}(\partial B)$
is surjective. More precisely, there exists a bounded linear operator
$E : H^{1/2}(\partial B) \to H^1(B)$ with $\tau \circ E = I$ on $H^{1/2}(\partial B)$ (that is, $E$ is a right inverse
of $\tau$).


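Proof: Let $f \in C^\infty(\overline{B})$. Then (see (A.40) and (A.28))

$$\begin{aligned}
\|f\|^2_{H^1(B)} &= 2\pi \sum_{m \in \mathbb{Z}} \int_0^1 \Bigl[ \Bigl( 1 + \frac{m^2}{r^2} \Bigr) |f_m(r)|^2 + |f_m'(r)|^2 \Bigr]\, r\, dr, \\
\|\tau f\|^2_{H^{1/2}(\partial B)} &= \sum_{m \in \mathbb{Z}} \sqrt{1 + m^2}\, |f_m(1)|^2.
\end{aligned}$$

We estimate, using the fundamental theorem of calculus and the inequality of
Cauchy-Schwarz,

$$\begin{aligned}
|f_m(1)|^2 &= \int_0^1 \frac{d}{dr} \bigl( r^2 |f_m(r)|^2 \bigr)\, dr = 2 \int_0^1 |f_m(r)|^2\, r\, dr + 2\, \mathrm{Re} \int_0^1 f_m(r)\, \overline{f_m'(r)}\, r^2\, dr \\
&\le 2 \int_0^1 |f_m(r)|^2\, r\, dr + 2 \sqrt{\int_0^1 |f_m(r)|^2\, r^2\, dr}\ \sqrt{\int_0^1 |f_m'(r)|^2\, r^2\, dr}.
\end{aligned}$$

Using the inequality $2ab \le \sqrt{1 + m^2}\, a^2 + \frac{b^2}{\sqrt{1 + m^2}}$ yields

$$\begin{aligned}
\sqrt{1 + m^2}\, |f_m(1)|^2 &\le 2 \sqrt{1 + m^2} \int_0^1 |f_m(r)|^2\, r\, dr + (1 + m^2) \int_0^1 |f_m(r)|^2\, r^2\, dr + \int_0^1 |f_m'(r)|^2\, r^2\, dr \\
&\le 3 (1 + m^2) \int_0^1 |f_m(r)|^2\, r\, dr + \int_0^1 |f_m'(r)|^2\, r\, dr \\
&\le 3 \int_0^1 \Bigl( 1 + \frac{m^2}{r^2} \Bigr) |f_m(r)|^2\, r\, dr + \int_0^1 |f_m'(r)|^2\, r\, dr,
\end{aligned}$$

where we have also used the estimates $r^2 \le r$ for $r \in [0,1]$ and $\sqrt{1 + m^2} \le 1 + m^2$.
By summation, we conclude that

$$\|\tau f\|^2_{H^{1/2}(\partial B)} \le \frac{3}{2\pi}\, \|f\|^2_{H^1(B)} \quad\text{for all } f \in C^\infty(\overline{B}). \tag{A.41}$$

Therefore, the trace operator is bounded with respect to the norms of $H^1(B)$
and $H^{1/2}(\partial B)$. By the general functional analytic Theorem A.30, the operator
has an extension to a bounded operator from $H^1(B)$ into $H^{1/2}(\partial B)$.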


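We define the operator $E : C^\infty(\partial B) \to H^1(B)$ by

$$(Ef)(r,\varphi) = \sum_{m \in \mathbb{Z}} f_m\, r^{|m|}\, e^{im\varphi}, \quad r \in [0,1],\ \varphi \in [0,2\pi].$$

Here, again, $f_m$ are the Fourier coefficients of $f \in C^\infty(\partial B)$.
Obviously, $(\tau E f)(\varphi) = \sum_{m \in \mathbb{Z}} f_m\, e^{im\varphi} = f(\varphi)$; that is, $E$ is a right inverse of $\tau$.
It remains to show the boundedness of $E$:

$$\begin{aligned}
\|Ef\|^2_{H^1(B)} &= 2\pi \sum_{m \in \mathbb{Z}} |f_m|^2 \int_0^1 \Bigl[ \Bigl( 1 + \frac{m^2}{r^2} \Bigr) r^{2|m|} + m^2 r^{2|m| - 2} \Bigr]\, r\, dr \\
&= 2\pi \sum_{m \in \mathbb{Z}} |f_m|^2 \Bigl( \frac{1}{2|m| + 2} + |m| \Bigr) \le 2\pi \sum_{m \in \mathbb{Z}} |f_m|^2 (1 + |m|) \\
&\le \sqrt{2}\, 2\pi \sum_{m \in \mathbb{Z}} |f_m|^2 \sqrt{1 + m^2} = \sqrt{2}\, 2\pi\, \|f\|^2_{H^{1/2}(\partial B)},
\end{aligned}$$

where we used the inequality $1 + |m| \le \sqrt{2} \sqrt{1 + m^2}$. Therefore, $E$ also possesses
an extension to a bounded operator from $H^{1/2}(\partial B)$ to $H^1(B)$.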


Remark: The trace operator is compact when considered as an operator
from $H^1(B)$ to $L^2(\partial B)$ because it is the composition of the bounded operator
$\tau : H^1(B) \to H^{1/2}(\partial B)$ and the compact embedding $j : H^{1/2}(\partial B) \to L^2(\partial B)$.


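We now consider the subspaces

$$\begin{aligned}
L^2_\diamond(\partial B) &= \Bigl\{ f \in L^2(\partial B) : \int_{\partial B} f\, d\ell = 0 \Bigr\}, \\
H^{1/2}_\diamond(\partial B) &= \Bigl\{ f \in H^{1/2}(\partial B) : \int_{\partial B} f\, d\ell = 0 \Bigr\}, \\
H^1_\diamond(B) &= \Bigl\{ f \in H^1(B) : \int_{\partial B} \tau f\, d\ell = 0 \Bigr\}.
\end{aligned}$$

Because $\int_0^{2\pi} \exp(im\varphi)\, d\varphi = 0$ for $m \neq 0$, the spaces $H^{1/2}_\diamond(\partial B)$ and $H^1_\diamond(B)$
consist exactly of the functions with the representations

$$f(\varphi) = \sum_{\substack{m \in \mathbb{Z} \\ m \neq 0}} f_m\, e^{im\varphi} \quad\text{and}\quad f(r,\varphi) = \sum_{m \in \mathbb{Z}} f_m(r)\, e^{im\varphi},$$

that satisfy the summation conditions

$$\sum_{\substack{m \in \mathbb{Z} \\ m \neq 0}} \sqrt{1 + m^2}\, |f_m|^2 < \infty \quad\text{and}$$

$$\sum_{m \in \mathbb{Z}} \int_0^1 \Bigl[ \Bigl( 1 + \frac{m^2}{r^2} \Bigr) |f_m(r)|^2 + |f_m'(r)|^2 \Bigr]\, r\, dr < \infty \quad\text{and}\quad f_0(1) = 0,$$

respectively.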


We can define an equivalent norm in the subspace $H^1_\diamond(B)$. This is a consequence
of the following result.

Theorem A.50 (Friedrichs' Inequality)
For all $f \in H^1_\diamond(B)$, we have

$$\|f\|_{L^2(B)} \le \sqrt{2}\, \|\nabla f\|_{L^2(B)}. \tag{A.42}$$
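Proof: Again, we use the representation of the norm in polar coordinates:

$$\begin{aligned}
r\, |f_m(r)|^2 &= \int_0^r \frac{d}{ds} \bigl( s\, |f_m(s)|^2 \bigr)\, ds = \int_0^r |f_m(s)|^2\, ds + 2\, \mathrm{Re} \int_0^r f_m(s)\, \overline{f_m'(s)}\, s\, ds \\
&\le \int_0^1 |f_m(s)|^2\, ds + 2 \sqrt{\int_0^1 |f_m(s)|^2\, s\, ds}\ \sqrt{\int_0^1 |f_m'(s)|^2\, s\, ds} \\
&\le \int_0^1 |f_m(s)|^2 (1 + s)\, ds + \int_0^1 |f_m'(s)|^2\, s\, ds,
\end{aligned}$$

where we again used $2ab \le a^2 + b^2$ in the last step.
First let $|m| \ge 1$. By $1 + s \le 2 \le 2m^2/s$, it is

$$r\, |f_m(r)|^2 \le 2 \int_0^1 \frac{m^2}{s^2}\, |f_m(s)|^2\, s\, ds + \int_0^1 |f_m'(s)|^2\, s\, ds,$$

and thus by integration

$$\int_0^1 r\, |f_m(r)|^2\, dr \le 2 \int_0^1 \frac{m^2}{s^2}\, |f_m(s)|^2\, s\, ds + \int_0^1 |f_m'(s)|^2\, s\, ds. \tag{A.43}$$

We finally consider $f_0$. It is $f_0(r) = -\int_r^1 f_0'(s)\, ds$ because $f_0(1) = 0$, thus

$$r\, |f_0(r)|^2 \le r (1 - r) \int_r^1 |f_0'(s)|^2\, ds \le \int_r^1 |f_0'(s)|^2\, s\, ds \le \int_0^1 |f_0'(s)|^2\, s\, ds.$$

Therefore, (A.43) also holds for $m = 0$. Summation with respect to $m$ yields
the assertion.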


Remark: Therefore, $f \mapsto \|\nabla f\|_{L^2(B)}$ defines an equivalent norm to $\|\cdot\|_{H^1(B)}$
in $H^1_\diamond(B)$. Indeed, for $f \in H^1_\diamond(B)$ it holds by Friedrichs' inequality:

$$\|f\|^2_{H^1(B)} = \|f\|^2_{L^2(B)} + \|\nabla f\|^2_{L^2(B)} \le 3\, \|\nabla f\|^2_{L^2(B)},$$

thus

$$\frac{1}{\sqrt{3}}\, \|f\|_{H^1(B)} \le \|\nabla f\|_{L^2(B)} \le \|f\|_{H^1(B)} \quad\text{for all } f \in H^1_\diamond(B). \tag{A.44}$$


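So far, we considered spaces of complex-valued functions. The spaces of real-valued
functions are closed subspaces. In the Fourier representation, one has

$$\overline{f(r,\varphi)} = \sum_{m \in \mathbb{Z}} \overline{f_m(r)}\, e^{-im\varphi} = \sum_{m \in \mathbb{Z}} \overline{f_{-m}(r)}\, e^{im\varphi} = f(r,\varphi) = \sum_{m \in \mathbb{Z}} f_m(r)\, e^{im\varphi}$$

because $\overline{f(r,\varphi)} = f(r,\varphi)$. Therefore, $f_{-m} = \overline{f_m}$ for all $m$. All of the theorems
remain valid also for Sobolev spaces of real-valued functions.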


A.6 Spectral Theory for Compact Operators in 
Hilbert Spaces 


Definition A.51 (Spectrum)

Let $X$ be a normed space and $A : X \to X$ be a linear operator. The spectrum
$\sigma(A)$ is defined as the set of (complex) numbers $\lambda$ such that the operator $A - \lambda I$
does not have a bounded inverse on $X$. Here, $I$ denotes the identity on $X$.
$\lambda \in \sigma(A)$ is called an eigenvalue of $A$ if $A - \lambda I$ is not one-to-one. If $\lambda$ is an
eigenvalue, then the nontrivial elements $x$ of the kernel $\mathcal{N}(A - \lambda I) = \{x \in X : Ax - \lambda x = 0\}$
are called eigenvectors of $A$.


This definition makes sense for arbitrary linear operators in normed spaces.
For noncompact operators $A$, it is possible that even for $\lambda \neq 0$ the operator
$A - \lambda I$ is one-to-one but fails to be bijective. As an example, we consider
$X = \ell^2$ and define $A$ by

$$(Ax)_k := \begin{cases} 0, & \text{if } k = 1, \\ x_{k-1}, & \text{if } k \ge 2, \end{cases}$$

for $x = (x_k) \in \ell^2$. Then $\lambda = 1$ belongs to the spectrum of $A$ but is not an
eigenvalue of $A$, see Problem A.4.


Theorem A.52 Let $A : X \to X$ be a linear bounded operator.


(a) Let $x_j \in X$, $j = 1, \ldots, n$, be a finite set of eigenvectors corresponding to
pairwise different eigenvalues $\lambda_j \in \mathbb{C}$. Then $\{x_1, \ldots, x_n\}$ are linearly independent.
If $X$ is a Hilbert space and $A$ is self-adjoint (that is, $A^* = A$),
then all eigenvalues $\lambda_j$ are real-valued and the corresponding eigenvectors
$x_1, \ldots, x_n$ are pairwise orthogonal.


(b) Let $X$ be a Hilbert space and $A : X \to X$ be self-adjoint. Then

$$\|A\|_{\mathcal{L}(X)} = \sup_{\|x\|_X = 1} |(Ax, x)_X| = r(A),$$

where $r(A) = \sup\{|\lambda| : \lambda \in \sigma(A)\}$ is called the spectral radius of $A$.

The situation is simpler for compact operators. We collect the most important
results in the following fundamental theorem.


Theorem A.53 (Spectral Theorem for Compact Self-Adjoint Operators)
Let $K : X \to X$ be compact and self-adjoint (and $K \neq 0$). Then the following
holds:


(a) The spectrum consists only of eigenvalues and possibly 0. Every eigenvalue
of $K$ is real-valued. $K$ has at least one but at most a countable number of
eigenvalues with 0 as the only possible accumulation point.


(b) For every eigenvalue $\lambda \neq 0$, there exist only finitely many linearly independent
eigenvectors; that is, the eigenspaces are finite-dimensional. Eigenvectors
corresponding to different eigenvalues are orthogonal.


(c) We order the eigenvalues in the form

$$|\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \cdots$$

and denote by $P_j : X \to \mathcal{N}(K - \lambda_j I)$ the orthogonal projection onto
the eigenspace corresponding to $\lambda_j$. If there exist only a finite number
$\lambda_1, \ldots, \lambda_m$ of eigenvalues, then

$$K = \sum_{j=1}^{m} \lambda_j P_j.$$

If there exists an infinite sequence $(\lambda_j)$ of eigenvalues, then

$$K = \sum_{j=1}^{\infty} \lambda_j P_j,$$

where the series converges in the operator norm. Furthermore,

$$\Bigl\| K - \sum_{j=1}^{m} \lambda_j P_j \Bigr\|_{\mathcal{L}(X)} = |\lambda_{m+1}|.$$

(d) Let $H$ be the linear span of all of the eigenvectors corresponding to the
eigenvalues $\lambda_j \neq 0$ of $K$. Then

$$X = \operatorname{closure}(H) \oplus \mathcal{N}(K).$$
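
In finite dimensions, part (c) reduces to the eigendecomposition of a symmetric matrix. The following is a minimal sketch, assuming a randomly generated real symmetric matrix `K` with simple eigenvalues (so that every projection $P_j$ has rank one); the helper `partial_sum` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
K = (B + B.T) / 2                          # real symmetric, hence self-adjoint

lam, V = np.linalg.eigh(K)                 # columns of V: orthonormal eigenvectors
order = np.argsort(-np.abs(lam))           # order such that |lam_1| >= |lam_2| >= ...
lam, V = lam[order], V[:, order]

def partial_sum(m):
    """sum_{j <= m} lam_j P_j with the rank-one projections P_j = v_j v_j^T."""
    return (V[:, :m] * lam[:m]) @ V[:, :m].T

errors = [np.linalg.norm(K - partial_sum(m), 2) for m in range(5)]
print(np.round(errors, 6))
print(np.round(np.abs(lam), 6))
```

The two printed rows agree: the spectral norm of the remainder after $m$ terms equals the modulus of the next eigenvalue.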


Sometimes, part (d) is formulated differently. For a common treatment of
the cases of finitely and infinitely many eigenvalues, we introduce the index set
$J \subset \mathbb{N}$, where $J$ is finite in the first case and $J = \mathbb{N}$ in the second case. For
every eigenvalue $\lambda_j$, $j \in J$, we choose an orthonormal basis of the corresponding
eigenspace $\mathcal{N}(K - \lambda_j I)$. Again, let the eigenvalues $\lambda_j \neq 0$ be ordered in the
form

$$|\lambda_1| \ge |\lambda_2| \ge |\lambda_3| \ge \cdots > 0.$$

By counting every $\lambda_j \neq 0$ relative to its multiplicity, we can assign an eigenvector
$x_j$ to every eigenvalue $\lambda_j$. Then every $x \in X$ possesses an abstract Fourier
expansion of the form

$$x = x_0 + \sum_{j \in J} (x, x_j)_X\, x_j$$

for some $x_0 \in \mathcal{N}(K)$ and

$$Kx = \sum_{j \in J} \lambda_j\, (x, x_j)_X\, x_j.$$

As a corollary, we observe that the set $\{x_j : j \in J\}$ of all eigenvectors forms a
complete system in $X$ if $K$ is one-to-one.


The eigenvalues can be expressed by Courant's max-inf and min-sup principle.
We need it in the following form.


Theorem A.54 Let $K : X \to X$ be compact and self-adjoint (and $K \neq 0$) and
let $\{\lambda_j^- : j = 1, \ldots, n_-\}$ and $\{\lambda_j^+ : j = 1, \ldots, n_+\}$ be its negative and positive
eigenvalues, respectively, ordered as

$$\lambda_1^- \le \lambda_2^- \le \cdots < 0 < \cdots \le \lambda_2^+ \le \lambda_1^+$$

and counted according to multiplicity. Here $n_\pm$ can be zero (if no positive or
negative, respectively, eigenvalues occur), finite, or infinity.


(a) If there exist positive eigenvalues $\lambda_m^+$, then

$$\lambda_m^+ = \min_{\substack{V \subset X \\ \dim V = m-1}}\ \sup_{x \in V^\perp} \frac{(Kx, x)_X}{\|x\|_X^2}.$$

(b) If there exist negative eigenvalues $\lambda_m^-$, then

$$\lambda_m^- = \max_{\substack{V \subset X \\ \dim V = m-1}}\ \inf_{x \in V^\perp} \frac{(Kx, x)_X}{\|x\|_X^2}.$$

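Proof: We only prove part (b) because this is needed in the proof of Theorem 7.52.
Set

$$\mu_m = \sup_{\substack{V \subset X \\ \dim V = m-1}}\ \inf_{x \in V^\perp} \frac{(Kx, x)_X}{\|x\|_X^2}$$

for abbreviation. For any $x \in X$ we use the representation as $x = x_0 + \sum_j c_j^- x_j^- + \sum_j c_j^+ x_j^+$
with $Kx_0 = 0$. Here, $x_j^\pm \in X$ are the orthonormal eigenvectors corresponding
to $\lambda_j^\pm$. Then $(Kx, x)_X = \sum_j \lambda_j^- |c_j^-|^2 + \sum_j \lambda_j^+ |c_j^+|^2 \ge \sum_j \lambda_j^- |c_j^-|^2$.
For $V = \operatorname{span}\{x_j^- : j = 1, \ldots, m-1\}$ and $x \in V^\perp$ we have that $c_j^- = (x, x_j^-)_X = 0$
for $j = 1, \ldots, m-1$ and thus

$$\frac{(Kx, x)_X}{\|x\|_X^2} \ge \frac{\sum_{j \ge m} \lambda_j^- |c_j^-|^2}{\|x_0\|_X^2 + \sum_j |c_j^-|^2 + \sum_j |c_j^+|^2} \ge \lambda_m^-$$

because $\lambda_m^- < 0$. Therefore, $\mu_m \ge \lambda_m^-$. We note that for this choice of $V$ the
infimum is attained for $x = x_m^-$.

Let, on the other hand, $V \subset X$ be an arbitrary subspace of dimension $m-1$.
Choose a basis $\{v_j : j = 1, \ldots, m-1\}$ of $V$. Construct $\hat{x} \in \operatorname{span}\{x_j^- : j = 1, \ldots, m\}$
such that $\hat{x} \perp V$ and $\|\hat{x}\|_X = 1$. This is possible. Indeed, the ansatz
$\hat{x} = \sum_{j=1}^{m} c_j x_j^-$ leads to the system $\sum_{j=1}^{m} c_j (x_j^-, v_\ell)_X = 0$ for $\ell = 1, \ldots, m-1$.
This system of $m-1$ equations and $m$ variables has a nontrivial solution which
can be normalized such that $\|\hat{x}\|_X^2 = \sum_{j=1}^{m} |c_j|^2 = 1$. Therefore,

$$\inf_{x \in V^\perp} \frac{(Kx, x)_X}{\|x\|_X^2} \le (K\hat{x}, \hat{x})_X = \sum_{j=1}^{m} \lambda_j^- |c_j|^2 \le \lambda_m^- \sum_{j=1}^{m} |c_j|^2 = \lambda_m^-.$$

This shows $\mu_m \le \lambda_m^-$ and ends the proof.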


The following corollary is helpful. 


Corollary A.55 Let $K : X \to X$ and $\lambda_j^\pm$ as in the previous theorem.


(a) If there exists a subspace $W \subset X$ of dimension $m$ with $(Kx, x)_X + \|x\|_X^2 \le 0$
for all $x \in W$ then (at least) $m$ negative eigenvalues exist and
$\lambda_1^- \le \cdots \le \lambda_m^- \le -1$.


(b) The eigenvalues $\lambda_j^\pm$ depend continuously on $K$ in the operator norm. More
precisely¹, if eigenvalues $\lambda_1^- \le \cdots \le \lambda_m^- < 0$ of $K$ exist and if $S : X \to X$
is also compact and self-adjoint with sufficiently small $\|S - K\|_{\mathcal{L}(X)}$ then
also eigenvalues $\mu_1^- \le \cdots \le \mu_m^- < 0$ of $S$ exist and $|\mu_j^- - \lambda_j^-| \le \|S - K\|_{\mathcal{L}(X)}$
for all $j = 1, \ldots, m$.


Proof: (a) Let $V \subset X$ be an arbitrary subspace of dimension $m-1$. We
construct $w \in W$ with $\|w\|_X = 1$ such that $w \perp V$ as in the proof of the
previous theorem. Then

$$\inf_{x \in V^\perp} \frac{(Kx, x)_X}{\|x\|_X^2} \le (Kw, w)_X \le -\|w\|_X^2 = -1.$$

Since this holds for every such subspace $V$, we conclude from the previous
theorem that $\lambda_m^- \le -1$.


¹ Formulated for the negative eigenvalues.

(b) Assume that some negative eigenvalues $\lambda_1^- \le \cdots \le \lambda_m^- < 0$ for $K$ exist.
Then, for any $j \in \{1, \ldots, m\}$ and any subspace $V \subset X$ of dimension $j - 1$,

$$\begin{aligned}
\inf_{x \in V^\perp} \frac{(Sx, x)_X}{\|x\|_X^2} &= \inf_{x \in V^\perp} \Bigl[ \frac{((S - K)x, x)_X}{\|x\|_X^2} + \frac{(Kx, x)_X}{\|x\|_X^2} \Bigr] \\
&\le \|S - K\|_{\mathcal{L}(X)} + \inf_{x \in V^\perp} \frac{(Kx, x)_X}{\|x\|_X^2} \\
&\le \|S - K\|_{\mathcal{L}(X)} + \lambda_j^-,
\end{aligned}$$

which is negative for sufficiently small $\|S - K\|_{\mathcal{L}(X)}$. Therefore, $S$ has negative
eigenvalues as well and, by taking the supremum, $\mu_j^- \le \lambda_j^- + \|S - K\|_{\mathcal{L}(X)}$. Now
we can interchange the roles of $S$ and $K$ which yields $|\mu_j^- - \lambda_j^-| \le \|S - K\|_{\mathcal{L}(X)}$.
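
For matrices, the continuity statement of part (b) can be tested directly. The following is a minimal sketch, assuming real symmetric matrices `K` and `S` generated at random and comparing all eigenvalues in ascending order (in finite dimensions the bound is not restricted to the negative ones); the names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((6, 6))
K = (B + B.T) / 2                          # self-adjoint operator
P = rng.standard_normal((6, 6))
S = K + 1e-3 * (P + P.T) / 2               # small self-adjoint perturbation of K

lam = np.linalg.eigvalsh(K)                # eigenvalues in ascending order
mu = np.linalg.eigvalsh(S)
gap = np.max(np.abs(mu - lam))             # largest eigenvalue shift
pert = np.linalg.norm(S - K, 2)            # ||S - K|| in the spectral norm
print(f"max |mu_j - lam_j| = {gap:.3e} <= ||S - K|| = {pert:.3e}")
```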


The spectral theorem for compact self-adjoint operators has an extension to
non-self-adjoint operators $K : X \to Y$. First, we have the following definition.


Definition A.56 (Singular Values)

Let $X$ and $Y$ be Hilbert spaces and $K : X \to Y$ be a compact operator with
adjoint operator $K^* : Y \to X$. The square roots $\mu_j = \sqrt{\lambda_j}$, $j \in J$, of the
eigenvalues $\lambda_j$ of the self-adjoint operator $K^* K : X \to X$ are called singular
values of $K$. Here again, $J \subset \mathbb{N}$ could be either finite or $J = \mathbb{N}$.


Note that every eigenvalue $\lambda$ of $K^* K$ is non-negative because $K^* K x = \lambda x$
implies that $\lambda (x, x)_X = (K^* K x, x)_X = (Kx, Kx)_Y \ge 0$; that is, $\lambda \ge 0$.


Theorem A.57 (Singular Value Decomposition)

Let $K : X \to Y$ be a linear compact operator, $K^* : Y \to X$ its adjoint
operator, and $\mu_1 \ge \mu_2 \ge \mu_3 \ge \cdots > 0$ the ordered sequence of the positive singular
values of $K$, counted relative to its multiplicity. Then there exist orthonormal
systems $\{x_j : j \in J\} \subset X$ and $\{y_j : j \in J\} \subset Y$ with the following properties:

$$K x_j = \mu_j y_j \quad\text{and}\quad K^* y_j = \mu_j x_j \quad\text{for all } j \in J.$$

The system $\{\mu_j, x_j, y_j : j \in J\}$ is called a singular system for $K$. Every $x \in X$
possesses the singular value decomposition

$$x = x_0 + \sum_{j \in J} (x, x_j)_X\, x_j$$

for some $x_0 \in \mathcal{N}(K)$ and

$$Kx = \sum_{j \in J} \mu_j\, (x, x_j)_X\, y_j.$$

Note that $J \subset \mathbb{N}$ can be finite or infinite; that is, $J = \mathbb{N}$.

The following theorem characterizes the range of a compact operator with
the help of a singular system.


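Theorem A.58 (Picard)
Let $K : X \to Y$ be a linear compact operator with singular system $\{\mu_j, x_j, y_j : j \in J\}$.
The equation

$$Kx = y \tag{A.45}$$

is solvable if and only if

$$y \in \mathcal{N}(K^*)^\perp \quad\text{and}\quad \sum_{j \in J} \frac{1}{\mu_j^2}\, |(y, y_j)_Y|^2 < \infty. \tag{A.46}$$

In this case

$$x = \sum_{j \in J} \frac{1}{\mu_j}\, (y, y_j)_Y\, x_j$$

is a solution of (A.45).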
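
In finite dimensions, the solution formula of Picard's theorem is the familiar inversion by means of the singular value decomposition. The following is a minimal sketch, assuming a real matrix `K` in place of the compact operator and a right-hand side `y` chosen in the range of `K`; the function `picard_solution` and its tolerance `tol` are hypothetical.

```python
import numpy as np

def picard_solution(K, y, tol=1e-12):
    """x = sum_j (1/mu_j) (y, y_j) x_j over the singular values mu_j > tol."""
    Y, mu, Xh = np.linalg.svd(K, full_matrices=False)   # K = Y diag(mu) Xh
    keep = mu > tol
    coeff = (Y[:, keep].conj().T @ y) / mu[keep]         # (y, y_j) / mu_j
    return Xh[keep].conj().T @ coeff

rng = np.random.default_rng(3)
K = rng.standard_normal((6, 4))
y = K @ rng.standard_normal(4)             # y lies in the range of K
x = picard_solution(K, y)
print(np.linalg.norm(K @ x - y))
```

For a right-hand side outside the range of `K`, the same formula returns a least-squares solution, which is not covered by the theorem above.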


We note that the solvability conditions (A.46) require a fast decay of the
Fourier coefficients of $y$ with respect to the orthonormal system $\{y_j : j \in J\}$ in
order for the series

$$\sum_{j=1}^{\infty} \frac{1}{\mu_j^2}\, |(y, y_j)_Y|^2$$

to converge. Of course, this condition is only necessary for the important case
where there exist infinitely many singular values. As a simple example, we study
the following integral operator.

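Example A.59
Let $K : L^2(0,1) \to L^2(0,1)$ be defined by

$$(Kx)(t) := \int_0^t x(s)\, ds, \quad t \in (0,1),\ x \in L^2(0,1).$$

Then

$$(K^* y)(t) = \int_t^1 y(s)\, ds \quad\text{and}\quad (K^* K x)(t) = \int_t^1 \Bigl( \int_0^s x(\tau)\, d\tau \Bigr) ds.$$

The eigenvalue problem $K^* K x = \lambda x$ is equivalent to

$$\lambda\, x(t) = \int_t^1 \Bigl( \int_0^s x(\tau)\, d\tau \Bigr) ds, \quad t \in [0,1].$$

Differentiating twice, we observe that for $\lambda \neq 0$ this is equivalent to the eigenvalue
problem

$$\lambda x'' + x = 0 \ \text{in } (0,1), \quad x(1) = x'(0) = 0.$$

Solving this yields

$$x_j(t) = \sqrt{2}\, \cos\frac{(2j-1)\pi t}{2}, \quad t \in [0,1], \quad\text{and}\quad \lambda_j = \frac{4}{(2j-1)^2 \pi^2}$$

for $j \in \mathbb{N}$. The singular values $\mu_j$ and the ONS $\{y_j : j \in \mathbb{N}\}$ are given by

$$\mu_j = \frac{2}{(2j-1)\pi} \quad\text{and}\quad y_j(t) = \sqrt{2}\, \sin\frac{(2j-1)\pi t}{2}, \quad j \in \mathbb{N}.$$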

The singular value decomposition of Theorem A.57 makes it possible to
define, for every continuous function $f : [0, \|K\|^2_{\mathcal{L}(X,Y)}] \to \mathbb{R}$, the operator
$f(K^* K)$ from $X$ into itself by

$$f(K^* K)\, x = \sum_{j \in J} f(\mu_j^2)\, (x, x_j)_X\, x_j, \quad x \in X. \tag{A.47}$$

This operator is always well-defined, linear, and bounded. It is compact if,
and only if, $f(0) = 0$ (see Problem A.5). The special cases $f(t) = t$ and
$f(t) = \sqrt{t}$ are of particular importance. From this definition we note that
$\mathcal{R}(\sqrt{K^* K}) = \mathcal{R}(K^*)$.
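
For a matrix, the definition (A.47) can be written down in a few lines. The following is a minimal sketch, assuming a real matrix `K` with trivial null space and using the right singular vectors returned by NumPy as the system $\{x_j\}$; the function `apply_function` is hypothetical.

```python
import numpy as np

def apply_function(K, f, x):
    """f(K*K) x = sum_j f(mu_j^2) (x, x_j) x_j as in (A.47)."""
    _, mu, Xh = np.linalg.svd(K, full_matrices=False)   # rows of Xh are the x_j^*
    return Xh.conj().T @ (f(mu**2) * (Xh @ x))

rng = np.random.default_rng(4)
K = rng.standard_normal((5, 3))
x = rng.standard_normal(3)

direct = K.T @ K @ x                                    # f(t) = t gives K*K itself
via_f = apply_function(K, lambda t: t, x)
via_sqrt = apply_function(K, np.sqrt, apply_function(K, np.sqrt, x))
print(np.linalg.norm(direct - via_f), np.linalg.norm(direct - via_sqrt))
```

Applying the case $f(t) = \sqrt{t}$ twice reproduces $K^* K$, which is the defining property of the square root.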

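For the operator $K$ of the previous Example A.59 we note that $x \in \mathcal{R}((K^* K)^{\sigma/2})$
if, and only if,

$$\sum_{j=1}^{\infty} (2j - 1)^{2\sigma}\, |c_j|^2 < \infty \quad\text{where}\quad c_j = \sqrt{2} \int_0^1 x(t)\, \cos\frac{(2j-1)\pi t}{2}\, dt$$

are the Fourier coefficients of $x$. Therefore, $\mathcal{R}((K^* K)^{\sigma/2})$ plays the role of the
periodic Sobolev spaces $H^\sigma_{\mathrm{per}}(0,2\pi)$ of Section A.4.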


A.7 The Fréchet Derivative 


In this section, we briefly recall some of the most important results for nonlinear
mappings between normed spaces. The notions of continuity and differentiability
carry over in a very natural way.


Definition A.60 Let $X$ and $Y$ be normed spaces over the field $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$,
$U \subset X$ an open subset, $\hat{x} \in U$, and $T : X \supset U \to Y$ be a (possibly nonlinear)
mapping.

(a) $T$ is called continuous at $\hat{x}$ if for every $\varepsilon > 0$ there exists $\delta > 0$ such that
$\|T(x) - T(\hat{x})\|_Y \le \varepsilon$ for all $x \in U$ with $\|x - \hat{x}\|_X \le \delta$.


(b) $T$ is called Fréchet differentiable at $\hat{x} \in U$ if there exists a linear bounded
operator $A : X \to Y$ (depending on $\hat{x}$) such that

$$\lim_{h \to 0} \frac{1}{\|h\|_X}\, \|T(\hat{x} + h) - T(\hat{x}) - Ah\|_Y = 0. \tag{A.48}$$

We write $T'(\hat{x}) := A$. In particular, $T'(\hat{x}) \in \mathcal{L}(X,Y)$.


(c) The mapping $T$ is called continuously Fréchet differentiable at $\hat{x} \in U$ if
$T$ is Fréchet differentiable in a neighborhood $V$ of $\hat{x}$ and the mapping
$T' : V \to \mathcal{L}(X,Y)$ is continuous in $\hat{x}$.


Continuity and differentiability of a mapping depend on the norms in $X$
and $Y$, in contrast to the finite-dimensional case. If $T$ is differentiable in $\hat{x}$,
then the linear bounded mapping $A$ in part (b) of Definition A.60 is unique.
Therefore, $T'(\hat{x}) := A$ is well-defined. If $T$ is differentiable in $x$, then $T$ is also
continuous in $x$. In the finite-dimensional case $X = \mathbb{K}^n$ and $Y = \mathbb{K}^m$, the linear
bounded mapping $T'(x)$ is given by the Jacobian (with respect to the Cartesian
coordinates).


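Example A.61 (Integral Operator)
Let $f : [c,d] \times [a,b] \times \mathbb{K} \to \mathbb{K}$, $f = f(t,s,r)$, $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, be continuous with respect to all arguments and also $\partial f/\partial r \in C([c,d] \times [a,b] \times \mathbb{K})$.

(a) Let the mapping $T : C[a,b] \to C[c,d]$ be defined by

\[ T(x)(t) := \int_a^b f\bigl(t,s,x(s)\bigr)\,ds, \quad t \in [c,d],\ x \in C[a,b]. \tag{A.49} \]

We equip the normed spaces $C[c,d]$ and $C[a,b]$ with the maximum norm. Then $T$ is continuously Fréchet differentiable with derivative

\[ \bigl(T'(x)z\bigr)(t) = \int_a^b \frac{\partial f}{\partial r}\bigl(t,s,x(s)\bigr)\, z(s)\,ds, \quad t \in [c,d],\ x,z \in C[a,b]. \]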


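Indeed, let $(Az)(t)$ be the term on the right-hand side. By assumption, $\partial f/\partial r$ is continuous on $[c,d] \times [a,b] \times \mathbb{K}$, thus uniformly continuous on the compact set $M = \{(t,s,r) \in [c,d] \times [a,b] \times \mathbb{K} : |r| \le \|x\|_\infty + 1\}$. Let $\varepsilon > 0$. Choose $\delta \in (0,1)$ with $\bigl|\frac{\partial f}{\partial r}(t,s,r) - \frac{\partial f}{\partial r}(t,s,\tilde{r})\bigr| \le \frac{\varepsilon}{b-a}$ for all $(t,s,r), (t,s,\tilde{r}) \in M$ with $|r - \tilde{r}| \le \delta$. We estimate for $z \in C[a,b]$ with $\|z\|_\infty \le \delta$:

\[
\begin{aligned}
\bigl|T(x+z)(t) - T(x)(t) - (Az)(t)\bigr|
&\le \int_a^b \int_0^1 \Bigl|\frac{\partial f}{\partial r}\bigl(t,s,x(s) + r z(s)\bigr) - \frac{\partial f}{\partial r}\bigl(t,s,x(s)\bigr)\Bigr|\, |z(s)|\,dr\,ds \\
&\le \frac{\varepsilon}{b-a}\, \|z\|_\infty \int_a^b ds \;=\; \varepsilon\, \|z\|_\infty .
\end{aligned}
\]

This holds for all $t \in [c,d]$, thus also for the maximum. Therefore,

\[ \frac{\|T(x+z) - T(x) - Az\|_\infty}{\|z\|_\infty} \le \varepsilon \quad \text{for all } z \in C[a,b] \text{ with } 0 < \|z\|_\infty \le \delta . \]

This proves the differentiability of $T$ from $C[a,b]$ into $C[c,d]$.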
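The estimate of part (a) can be observed numerically. The following is a minimal sketch, not part of the original text: the kernel `f`, the intervals, the trapezoidal quadrature, and the functions `x` and `z0` are all assumptions chosen for illustration. It evaluates the quotient $\|T(x+z) - T(x) - Az\|_\infty / \|z\|_\infty$ for shrinking $z$.

```python
import numpy as np

def trapezoid_weights(a, b, n):
    """Nodes and composite trapezoidal weights on [a, b]."""
    s = np.linspace(a, b, n)
    w = np.full(n, (b - a) / (n - 1))
    w[0] *= 0.5
    w[-1] *= 0.5
    return s, w

# Hypothetical kernel f(t, s, r) and its partial derivative with respect to r.
def f(t, s, r):
    return np.sin(t * r) + s * r**2

def df_dr(t, s, r):
    return t * np.cos(t * r) + 2.0 * s * r

a, b, c, d = 0.0, 1.0, 0.0, 2.0
s, w = trapezoid_weights(a, b, 201)
t = np.linspace(c, d, 101)

def T(x):
    # T(x)(t) = int_a^b f(t, s, x(s)) ds for every t on the grid
    return f(t[:, None], s[None, :], x[None, :]) @ w

def A(x, z):
    # (Az)(t) = int_a^b df/dr(t, s, x(s)) z(s) ds
    return (df_dr(t[:, None], s[None, :], x[None, :]) * z[None, :]) @ w

x = np.cos(3.0 * s)
z0 = np.sin(2.0 * np.pi * s) + 0.5      # fixed direction, maximum norm 1.5

def remainder_ratio(eps):
    z = eps * z0
    num = np.max(np.abs(T(x + z) - T(x) - A(x, z)))
    return num / np.max(np.abs(z))

ratios = [remainder_ratio(eps) for eps in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this particular kernel the second derivative with respect to $r$ is bounded by 6 on the relevant set, so the quotient is at most $3\|z\|_\infty$ and shrinks with $z$, as differentiability requires.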


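(b) Let now in addition $\partial f/\partial r \in C([c,d] \times [a,b] \times \mathbb{K})$ be Lipschitz continuous with respect to $r$; that is, there exists $\kappa > 0$ with $|\partial f(t,s,r)/\partial r - \partial f(t,s,\tilde{r})/\partial r| \le \kappa |r - \tilde{r}|$ for all $(t,s,r), (t,s,\tilde{r}) \in [c,d] \times [a,b] \times \mathbb{K}$. Then the operator $T$ of (A.49) is also Fréchet differentiable as an operator from $L^2(a,b)$ into $C[c,d]$ (and thus into $L^2(c,d)$) with the same representation of the derivative.

Indeed, define again the operator $A$ as in part (a). Then $A$ is bounded because

\[
\begin{aligned}
|(Az)(t)| &\le \int_a^b \Bigl|\frac{\partial f}{\partial r}\bigl(t,s,x(s)\bigr)\Bigr|\, |z(s)|\,ds \\
&\le \int_a^b \Bigl[\Bigl|\frac{\partial f}{\partial r}(t,s,0)\Bigr| + \kappa\, |x(s)|\Bigr]\, |z(s)|\,ds \\
&\le \bigl[\|\partial f(t,\cdot,0)/\partial r\|_{L^2(a,b)} + \kappa\, \|x\|_{L^2(a,b)}\bigr]\, \|z\|_{L^2(a,b)}
\end{aligned}
\]

for all $t \in [c,d]$.

Concerning the computation of the derivative we can proceed in the same way as in part (a):

\[
\begin{aligned}
\bigl|T(x+z)(t) - T(x)(t) - (Az)(t)\bigr|
&\le \int_a^b \int_0^1 \Bigl|\frac{\partial f}{\partial r}\bigl(t,s,x(s) + r z(s)\bigr) - \frac{\partial f}{\partial r}\bigl(t,s,x(s)\bigr)\Bigr|\, |z(s)|\,dr\,ds \\
&\le \kappa \int_a^b \int_0^1 r\,dr\, |z(s)|^2\,ds \;=\; \frac{\kappa}{2}\, \|z\|_{L^2(a,b)}^2 .
\end{aligned}
\]

This holds for all $t \in [c,d]$. Therefore, $\|T(x+z) - T(x) - Az\|_\infty \le \frac{\kappa}{2} \|z\|_{L^2(a,b)}^2$, which proves the differentiability.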


The following theorem collects further properties of the Fréchet derivative. 


Theorem A.62 (a) Let $T, S : X \supset U \to Y$ be Fréchet differentiable at $x \in U$. Then $T + S$ and $\lambda T$ are also Fréchet differentiable for all $\lambda \in \mathbb{K}$ and

\[ (T+S)'(x) = T'(x) + S'(x), \qquad (\lambda T)'(x) = \lambda\, T'(x). \]

(b) Chain rule: Let $T : X \supset U \to V \subset Y$ and $S : Y \supset V \to Z$ be Fréchet differentiable at $x \in U$ and $T(x) \in V$, respectively. Then $ST$ is also Fréchet differentiable at $x$ and

\[ (ST)'(x) = \underbrace{S'\bigl(T(x)\bigr)}_{\in \mathcal{L}(Y,Z)}\; \underbrace{T'(x)}_{\in \mathcal{L}(X,Y)} \in \mathcal{L}(X,Z). \]

(c) Special case: If $\hat{x}, h \in X$ are fixed and $T : X \to Y$ is Fréchet differentiable on $X$, then $\psi : \mathbb{K} \to Y$, defined by $\psi(t) := T(\hat{x} + th)$, $t \in \mathbb{K}$, is differentiable on $\mathbb{K}$ and $\psi'(t) = T'(\hat{x} + th)h \in Y$. Note that originally $\psi'(t) \in \mathcal{L}(\mathbb{K},Y)$. In this case, one identifies the linear mapping $\psi'(t) : \mathbb{K} \to Y$ with its generating element $\psi'(t) \in Y$.


If T’ is Lipschitz continuous then the following estimate holds. 
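Lemma A.63 Let $T$ be differentiable in the ball $B(\bar{x},\rho)$ centered at $\bar{x} \in U$ with radius $\rho > 0$, and let there exist $\gamma > 0$ with $\|T'(x) - T'(\bar{x})\|_{\mathcal{L}(X,Y)} \le \gamma \|x - \bar{x}\|_X$ for all $x \in B(\bar{x},\rho)$. Then

\[ \|T(x) - T(\bar{x}) - T'(\bar{x})(x - \bar{x})\|_Y \le \frac{\gamma}{2}\, \|x - \bar{x}\|_X^2 \quad \text{for all } x \in B(\bar{x},\rho). \]

Proof: Let $\ell \in Y^*$ and $x \in B(\bar{x},\rho)$ be kept fixed. Set $h = x - \bar{x}$ and define the scalar function $f(t) = \langle \ell, T(\bar{x} + th) \rangle$ for $|t| < \rho/\|h\|_X$ where $\langle \cdot,\cdot \rangle = \langle \cdot,\cdot \rangle_{Y^*,Y}$ denotes the dual pairing in $(Y^*,Y)$. Then, by the chain rule of the previous theorem, $f'(t) = \langle \ell, T'(\bar{x} + th)h \rangle$ and thus

\[
\begin{aligned}
\bigl|\langle \ell, T(x) - T(\bar{x}) - T'(\bar{x})h \rangle\bigr|
&= \bigl|\langle \ell, T(x) \rangle - \langle \ell, T(\bar{x}) \rangle - \langle \ell, T'(\bar{x})h \rangle\bigr| \\
&= \bigl|f(1) - f(0) - \langle \ell, T'(\bar{x})h \rangle\bigr|
 = \Bigl|\int_0^1 \bigl[f'(t) - \langle \ell, T'(\bar{x})h \rangle\bigr]\,dt\Bigr| \\
&= \Bigl|\int_0^1 \bigl[\langle \ell, T'(\bar{x} + th)h \rangle - \langle \ell, T'(\bar{x})h \rangle\bigr]\,dt\Bigr| \\
&= \Bigl|\int_0^1 \bigl\langle \ell, [T'(\bar{x} + th) - T'(\bar{x})]h \bigr\rangle\,dt\Bigr| \\
&\le \|\ell\|_{Y^*}\, \|h\|_X \int_0^1 \|T'(\bar{x} + th) - T'(\bar{x})\|_{\mathcal{L}(X,Y)}\,dt \\
&\le \gamma\, \|\ell\|_{Y^*}\, \|h\|_X^2 \int_0^1 t\,dt \;=\; \frac{\gamma}{2}\, \|\ell\|_{Y^*}\, \|h\|_X^2 .
\end{aligned}
\]

We set $y := T(x) - T(\bar{x}) - T'(\bar{x})h$ and choose $\ell \in Y^*$ with $\|\ell\|_{Y^*} = 1$ and $\langle \ell, y \rangle = \|y\|_Y$. This is possible by a well-known consequence of the Hahn-Banach theorem (see [151], Chap. V, §7, Theorem 2). Then the assertion follows.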


We recall Banach’s contraction mapping principle (compare with Theo- 
rem A.31 for the linear case). 


Theorem A.64 (Contraction Mapping Principle)
Let $C \subset X$ be a closed subset of the Banach space $X$ and $T : X \supset C \to X$ a (nonlinear) mapping with the properties

(a) $T$ maps $C$ into itself; that is, $T(x) \in C$ for all $x \in C$, and

(b) $T$ is a contraction on $C$; that is, there exists $c < 1$ with

\[ \|T(x) - T(y)\|_X \le c\, \|x - y\|_X \quad \text{for all } x, y \in C. \tag{A.50} \]

Then there exists a unique $\tilde{x} \in C$ with $T(\tilde{x}) = \tilde{x}$. The sequence $(x_\ell)$ in $C$, defined by $x_{\ell+1} := T(x_\ell)$, $\ell = 0,1,\ldots$, converges to $\tilde{x}$ for every $x_0 \in C$. Furthermore, the following error estimates hold:

\[ \|x_{\ell+1} - \tilde{x}\|_X \le c\, \|x_\ell - \tilde{x}\|_X, \quad \ell = 0,1,\ldots; \tag{A.51a} \]

that is, the sequence converges linearly to $\tilde{x}$,

\[ \|x_\ell - \tilde{x}\|_X \le \frac{c^\ell}{1-c}\, \|x_1 - x_0\|_X \quad \text{(a priori estimate)}, \tag{A.51b} \]

\[ \|x_\ell - \tilde{x}\|_X \le \frac{1}{1-c}\, \|x_{\ell+1} - x_\ell\|_X \quad \text{(a posteriori estimate)} \tag{A.51c} \]

for $\ell = 1,2,\ldots$
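The three error estimates can be checked on a scalar example. This is a small illustrative sketch under assumptions that are not in the text: the mapping $T(x) = \cos x$ on $C = [0,1]$ with contraction constant $c = \sin 1$ is chosen only for demonstration, and the reference fixed point is itself an iterate, not an exact value.

```python
import math

def fixed_point(T, x0, c, steps):
    """Iterate x_{l+1} = T(x_l) and record the bounds (A.51b) and (A.51c)."""
    xs = [x0]
    for _ in range(steps + 1):
        xs.append(T(xs[-1]))
    first_step = abs(xs[1] - xs[0])
    a_priori = [c**l / (1.0 - c) * first_step for l in range(steps + 1)]
    a_posteriori = [abs(xs[l + 1] - xs[l]) / (1.0 - c) for l in range(steps + 1)]
    return xs[: steps + 1], a_priori, a_posteriori

# cos maps [0, 1] into itself and |cos'(x)| = |sin x| <= sin(1) < 1 there.
c = math.sin(1.0)
xs, a_priori, a_posteriori = fixed_point(math.cos, 0.5, c, 40)

# Reference fixed point from many more iterations (a numerical approximation).
x_ref = 0.5
for _ in range(2000):
    x_ref = math.cos(x_ref)

errors = [abs(x - x_ref) for x in xs]
for l in (1, 5, 10, 20):
    print(l, errors[l], a_posteriori[l], a_priori[l])
```

In each printed row the true error should be bounded by the a posteriori value, which is typically sharper than the a priori one because it uses the most recent step length.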
The Newton method for systems of nonlinear equations has a direct analogy for equations of the form $T(x) = y$, where $T : X \to Y$ is a continuously Fréchet differentiable mapping between Banach spaces $X$ and $Y$. We formulate a simplified Newton method and prove local linear convergence. It differs from the ordinary Newton method not only by replacing the derivative $T'(x_\ell)$ by $T'(\hat{x})$ but also by requiring only the existence of a left inverse.


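Theorem A.65 (Simplified Newton Method)
Let $T : X \to Y$ be continuously Fréchet differentiable between Banach spaces $X$ and $Y$. Let $V \subset X$ be a closed subspace, $\hat{x} \in V$ and $\hat{y} := T(\hat{x}) \in Y$. Let $L : Y \to V$ be linear and bounded such that $L$ is a left inverse of $T'(\hat{x}) : X \to Y$ on $V$; that is, $L\,T'(\hat{x})v = v$ for all $v \in V$.

Then there exists $\varepsilon > 0$ such that for any $\bar{y} = T(\bar{x})$ with $\bar{x} \in X$ and $\|\bar{x} - \hat{x}\|_X \le \varepsilon$ the following algorithm converges linearly to some $\tilde{x} \in V$:

\[ x_0 = \hat{x}, \qquad x_{\ell+1} = x_\ell - L\bigl[T(x_\ell) - \bar{y}\bigr], \quad \ell = 0,1,2,\ldots \tag{A.52} \]

The limit $\tilde{x} \in V$ satisfies $L[T(\tilde{x}) - \bar{y}] = 0$.

Proof: We apply the contraction mapping principle of the preceding theorem to the mapping

\[ S(x) := x - L\bigl[T(x) - \bar{y}\bigr] = L\bigl[T'(\hat{x})x - T(x) + T(\bar{x})\bigr] \]

from $V$ into itself on some closed ball $B[\hat{x},\rho] \subset V$. We estimate

\[
\begin{aligned}
\|S(x) - S(z)\|_X &\le \|L\|_{\mathcal{L}(Y,X)}\, \|T'(\hat{x})(x - z) + T(z) - T(x)\|_Y \\
&\le \|L\|_{\mathcal{L}(Y,X)}\, \|x - z\|_X \Bigl\{ \|T'(\hat{x}) - T'(z)\|_{\mathcal{L}(X,Y)} + \frac{\|T(z) - T(x) + T'(z)(x - z)\|_Y}{\|x - z\|_X} \Bigr\}
\end{aligned}
\]

and

\[
\begin{aligned}
\|S(x) - \hat{x}\|_X &\le \|L\|_{\mathcal{L}(Y,X)}\, \|T'(\hat{x})(x - \hat{x}) - T(x) + T(\bar{x})\|_Y \\
&\le \|L\|_{\mathcal{L}(Y,X)}\, \|T'(\hat{x})(x - \hat{x}) + T(\hat{x}) - T(x)\|_Y + \|L\|_{\mathcal{L}(Y,X)}\, \|T(\hat{x}) - T(\bar{x})\|_Y .
\end{aligned}
\]

First, we choose $\rho > 0$ such that

\[ \|L\|_{\mathcal{L}(Y,X)} \Bigl\{ \|T'(\hat{x}) - T'(z)\|_{\mathcal{L}(X,Y)} + \frac{\|T(z) - T(x) + T'(z)(x - z)\|_Y}{\|x - z\|_X} \Bigr\} \le \frac{1}{2} \]

for all $x, z \in B[\hat{x},\rho]$. This is possible because $T$ is continuously differentiable.

Next, we choose $\varepsilon > 0$ such that

\[ \|L\|_{\mathcal{L}(Y,X)}\, \|T(\hat{x}) - T(\bar{x})\|_Y \le \frac{1}{2}\,\rho \]

for $\|\bar{x} - \hat{x}\|_X \le \varepsilon$. Then we conclude that

\[ \|S(x) - S(z)\|_X \le \frac{1}{2}\, \|x - z\|_X \quad \text{for all } x, z \in B[\hat{x},\rho], \]

\[ \|S(x) - \hat{x}\|_X \le \frac{1}{2}\, \|x - \hat{x}\|_X + \frac{1}{2}\,\rho \le \rho \quad \text{for all } x \in B[\hat{x},\rho]. \]

Application of the contraction mapping principle ends the proof.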
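The iteration (A.52) can be tried in a finite-dimensional setting. The sketch below is only an illustration under assumptions not made in the text: $X = Y = V = \mathbb{R}^2$, the mapping `T` is a made-up smooth perturbation of the identity, and $L$ is taken to be the full inverse of $T'(\hat{x})$, which is in particular a left inverse.

```python
import numpy as np

def T(x):
    # Hypothetical nonlinear mapping R^2 -> R^2
    return np.array([x[0] + 0.1 * x[1] ** 2, x[1] + 0.1 * np.sin(x[0])])

def T_prime(x):
    # Jacobian of T
    return np.array([[1.0, 0.2 * x[1]], [0.1 * np.cos(x[0]), 1.0]])

x_hat = np.zeros(2)
L = np.linalg.inv(T_prime(x_hat))   # frozen inverse, computed only once
x_bar = np.array([0.2, -0.1])       # exact solution used to generate the data
y_bar = T(x_bar)

x = x_hat.copy()
residuals = []
for _ in range(30):
    r = T(x) - y_bar
    residuals.append(np.linalg.norm(L @ r))   # equals ||x_{l+1} - x_l||
    x = x - L @ r                              # step (A.52)
print(x, residuals[:4])
```

Because the derivative is never updated, the step lengths decay by a roughly constant factor, which is the linear convergence stated in the theorem.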


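The notion of partial derivatives of mappings $T : X \times Z \to Y$ is introduced just as for functions of two scalar variables as the Fréchet derivative of the mappings $T(\cdot,z) : X \to Y$ for $z \in Z$ and $T(x,\cdot) : Z \to Y$ for $x \in X$. We denote the partial derivatives in $(x,z) \in X \times Z$ by

\[ \frac{\partial}{\partial x} T(x,z) \in \mathcal{L}(X,Y) \quad \text{and} \quad \frac{\partial}{\partial z} T(x,z) \in \mathcal{L}(Z,Y). \]

Theorem A.66 (Implicit Function Theorem)
Let $T : X \times Z \to Y$ be continuously Fréchet differentiable with partial derivatives $\frac{\partial}{\partial x} T(x,z) \in \mathcal{L}(X,Y)$ and $\frac{\partial}{\partial z} T(x,z) \in \mathcal{L}(Z,Y)$. Furthermore, let $T(\hat{x},\hat{z}) = 0$ and $\frac{\partial}{\partial z} T(\hat{x},\hat{z}) : Z \to Y$ be a norm-isomorphism from $Z$ onto $Y$. Then there exists a neighborhood $U$ of $\hat{x}$ and a Fréchet differentiable function $\psi : U \to Z$ such that $\psi(\hat{x}) = \hat{z}$ and $T(x,\psi(x)) = 0$ for all $x \in U$. The Fréchet derivative $\psi' \in \mathcal{L}(X,Z)$ is given by

\[ \psi'(x) = -\Bigl[\frac{\partial}{\partial z} T\bigl(x,\psi(x)\bigr)\Bigr]^{-1} \frac{\partial}{\partial x} T\bigl(x,\psi(x)\bigr), \quad x \in U. \]

The following special case is particularly important.
Let $Z = Y = \mathbb{K}$; thus $T : X \times \mathbb{K} \to \mathbb{K}$ and $T(\hat{x},\hat{\lambda}) = 0$ and $\frac{\partial}{\partial \lambda} T(\hat{x},\hat{\lambda}) \ne 0$. Then there exists a neighborhood $U$ of $\hat{x}$ and a Fréchet differentiable function $\psi : U \to \mathbb{K}$ such that $\psi(\hat{x}) = \hat{\lambda}$ and $T(x,\psi(x)) = 0$ for all $x \in U$ and

\[ \psi'(x) = -\frac{1}{\frac{\partial}{\partial \lambda} T\bigl(x,\psi(x)\bigr)}\, \frac{\partial}{\partial x} T\bigl(x,\psi(x)\bigr) \in \mathcal{L}(X,\mathbb{K}) = X^*, \quad x \in U, \]

where again $X^*$ denotes the dual space of $X$.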
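The derivative formula of the special case can be compared with finite differences. The following sketch assumes $X = \mathbb{R}^2$, $\mathbb{K} = \mathbb{R}$, and a made-up function $T(x,\lambda) = \lambda^3 + \lambda + x_1^2 - x_2$; none of these choices come from the text. Since $\partial T/\partial\lambda = 3\lambda^2 + 1 \ge 1$, the implicit function $\psi$ exists for every $x$.

```python
import numpy as np

def T(x, lam):
    # Hypothetical scalar equation T(x, lam) = 0 defining lam = psi(x)
    return lam**3 + lam + x[0] ** 2 - x[1]

def psi(x, iters=50):
    """Solve T(x, lam) = 0 for lam by Newton's method in the scalar variable."""
    lam = 0.0
    for _ in range(iters):
        lam -= T(x, lam) / (3.0 * lam**2 + 1.0)
    return lam

def psi_prime(x):
    # psi'(x) = -(dT/dlam)^{-1} dT/dx, evaluated at (x, psi(x))
    lam = psi(x)
    dT_dx = np.array([2.0 * x[0], -1.0])
    dT_dlam = 3.0 * lam**2 + 1.0
    return -dT_dx / dT_dlam

x = np.array([0.7, 1.3])
h = 1e-6
# Central finite differences of psi in each coordinate direction
fd = np.array([(psi(x + h * e) - psi(x - h * e)) / (2 * h) for e in np.eye(2)])
print(psi_prime(x), fd)
```

The two printed vectors should agree up to the finite-difference error.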
A.8 Convex Analysis

Definition A.67 Let $X$ be a normed space.

(a) A set $M \subset X$ is called convex if for all $x, y \in M$ and all $\lambda \in [0,1]$ also $\lambda x + (1-\lambda)y \in M$.

(b) Let $M \subset X$ be convex. A function $f : M \to \mathbb{R}$ is called convex if for all $x, y \in M$ and all $\lambda \in [0,1]$

\[ f\bigl(\lambda x + (1-\lambda)y\bigr) \le \lambda f(x) + (1-\lambda) f(y). \]

(c) A function $f : M \to \mathbb{R}$ is called concave if $-f$ is convex; that is, if for all $x, y \in M$ and all $\lambda \in [0,1]$

\[ f\bigl(\lambda x + (1-\lambda)y\bigr) \ge \lambda f(x) + (1-\lambda) f(y). \]

(d) $f$ is called strictly convex if

\[ f\bigl(\lambda x + (1-\lambda)y\bigr) < \lambda f(x) + (1-\lambda) f(y) \]

for all $x, y \in M$ and all $\lambda \in (0,1)$ with $x \ne y$. The definition for a strictly concave function is formulated analogously.


The definition of convexity of a set or a function can be extended to more 
than two elements. 


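Lemma A.68 Let $X$ be a normed space.

(a) A set $M \subset X$ is convex if, and only if, for any elements $x_j \in M$, $j = 1,\ldots,m$, and $\lambda_j \ge 0$ with $\sum_{j=1}^m \lambda_j = 1$ also the convex combination $\sum_{j=1}^m \lambda_j x_j$ belongs to $M$.

(b) Let $M \subset X$ be convex. A function $f : M \to \mathbb{R}$ is convex if, and only if,

\[ f\Bigl(\sum_{j=1}^m \lambda_j x_j\Bigr) \le \sum_{j=1}^m \lambda_j f(x_j) \]

for all $x_j \in M$, $j = 1,\ldots,m$, and $\lambda_j \ge 0$ with $\sum_{j=1}^m \lambda_j = 1$. For concave functions the characterization holds analogously.

For any set $A \subset X$ of a normed space $X$ the set

\[ \operatorname{conv} A = \Bigl\{ \sum_{j=1}^m \lambda_j a_j : a_j \in A,\ \lambda_j \ge 0,\ \sum_{j=1}^m \lambda_j = 1,\ m \in \mathbb{N} \Bigr\} \tag{A.53} \]

is convex by the previous lemma and is called the convex hull of $A$.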


The following separation theorem is one of the most important tools in the area 
of convex analysis. 


Theorem A.69 Let $X$ be a normed space over $\mathbb{R}$ and $A, B \subset X$ two convex sets with $A \cap B = \emptyset$. Furthermore, let $A$ be open. Then there exists a hyperplane which separates $A$ and $B$; that is, there exist $\ell \in X^*$ and $\gamma \in \mathbb{R}$ such that $\ell \ne 0$ and

\[ \langle \ell, a \rangle_{X^*,X} \ge \gamma \ge \langle \ell, b \rangle_{X^*,X} \quad \text{for all } a \in A \text{ and } b \in B. \]

Here, $\langle \ell, x \rangle_{X^*,X}$ denotes the dual pairing; that is, the application of $\ell \in X^*$ to $x \in X$.

For a proof we refer to, e.g., [139], Chapter II. The hyperplane is given by $\{x \in X : \langle \ell, x \rangle_{X^*,X} = \gamma\}$.

The convexity can be characterized easily for differentiable functions.


Lemma A.70 Let $X$ be a normed space and $M \subset X$ be an open convex set and $f : M \to \mathbb{R}$ be Fréchet differentiable on $M$. Then $f$ is convex if, and only if,

\[ f(y) - f(x) - f'(x)(y - x) \ge 0 \quad \text{for all } x, y \in M. \]

$f$ is strictly convex if, and only if, the inequality holds strictly for all $x \ne y$. For concave functions the characterizations hold analogously.

Proof: Let first $f$ be convex, $x, y \in M$ and $t \in (0,1)$. From the convexity of $f$ we conclude that $f(x + t(y - x)) \le f(x) + t[f(y) - f(x)]$, thus

\[
\begin{aligned}
f(y) - f(x) &\ge \frac{1}{t}\bigl[f(x + t(y - x)) - f(x)\bigr] \\
&= \frac{1}{t}\bigl[f(x + t(y - x)) - f(x) - t f'(x)(y - x)\bigr] + f'(x)(y - x).
\end{aligned}
\]

The first term on the right-hand side tends to zero as $t \to 0$. This proves $f(y) - f(x) \ge f'(x)(y - x)$.

Let now $f(u) - f(v) \ge f'(v)(u - v)$ for all $u, v \in M$. For $x, y \in M$ and $\lambda \in [0,1]$ apply this twice to $v = \lambda x + (1-\lambda)y$ and $u = y$ and $u = x$, respectively. With $y - v = -\lambda(x - y)$ and $x - v = (1-\lambda)(x - y)$ this yields

\[
\begin{aligned}
f(y) - f(v) &\ge f'(v)(y - v) = \lambda f'(v)(y - x), \\
f(x) - f(v) &\ge f'(v)(x - v) = (1-\lambda) f'(v)(x - y).
\end{aligned}
\]

Multiplying the first inequality by $1 - \lambda$ and the second by $\lambda$ and adding these inequalities yields the assertion.

Note that we could equally well write $\langle f'(x), y - x \rangle_{X^*,X}$ for $f'(x)(y - x)$. We use both notations synonymously.
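The characterization of Lemma A.70 can be tested numerically for a known convex function. As an assumption made only for this sketch, we take the log-sum-exp function on $\mathbb{R}^5$, whose gradient is the softmax vector, and evaluate $f(y) - f(x) - f'(x)(y - x)$ at random pairs of points.

```python
import numpy as np

def f(x):
    # log-sum-exp, a standard convex function (shifted for numerical stability)
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def grad_f(x):
    # gradient of log-sum-exp: the softmax of x
    e = np.exp(x - np.max(x))
    return e / e.sum()

rng = np.random.default_rng(0)
gaps = []
for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    gaps.append(f(y) - f(x) - grad_f(x) @ (y - x))
print(min(gaps))
```

A random test of this kind can only fail to find a counterexample; it does not replace the proof above.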

This characterization motivates the definition of the subgradient of a convex 
function. 


Definition A.71 Let $X$ be a normed space over $\mathbb{R}$ with dual $X^*$, $M \subset X$ be an open convex set, and $f : M \to \mathbb{R}$ be a convex function. For $x \in M$ the set

\[ \partial f(x) := \bigl\{ \ell \in X^* : f(z) - f(x) - \langle \ell, z - x \rangle_{X^*,X} \ge 0 \text{ for all } z \in M \bigr\} \]

is called the subgradient of $f$ at $x$.

As one sees from the function $f(x) = |x|$ for $x \in \mathbb{R}$ the subgradient $\partial f$ is, in general, a multivalued function. It is not empty for continuous functions, and it consists of the derivative as the only element for differentiable functions.
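The multivalued nature of the subgradient of $f(x) = |x|$ can be made concrete with a brute-force check of the defining inequality. The sketch below tests candidate slopes only on a finite grid of points $z$ (an assumption of this illustration), so it verifies a necessary condition rather than the full definition.

```python
def in_subgradient_abs(ell, x, grid):
    """Check |z| - |x| - ell * (z - x) >= 0 on a finite grid of test points z."""
    return all(abs(z) - abs(x) - ell * (z - x) >= -1e-12 for z in grid)

grid = [k / 100.0 for k in range(-300, 301)]

# Candidate slopes at the kink x = 0 and at the smooth point x = 2
at_zero = [ell for ell in (-1.5, -1.0, -0.3, 0.0, 0.8, 1.0, 1.2)
           if in_subgradient_abs(ell, 0.0, grid)]
at_two = [ell for ell in (-1.0, 0.0, 0.9, 1.0, 1.1)
          if in_subgradient_abs(ell, 2.0, grid)]
print(at_zero)   # slopes in [-1, 1] survive
print(at_two)    # only the derivative 1.0 survives
```

At $x = 0$ every slope in $[-1,1]$ passes the test, while at $x = 2$ only the derivative does.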
Lemma A.72 Let $X$ be a normed space over $\mathbb{R}$ with dual $X^*$, $M \subset X$ be an open convex set, and $f : M \to \mathbb{R}$ be a convex and continuous function.

(a) Then $\partial f(x) \ne \emptyset$ for all $x \in M$.

(b) If $f$ is Fréchet differentiable at $x \in M$ then $\partial f(x) = \{f'(x)\}$. In particular, $\partial f$ is single valued at $x$.

Proof: (a) Define the set $D \subset X \times \mathbb{R}$ by

\[ D := \{(z,r) \in M \times \mathbb{R} : r > f(z)\}. \]

Then $D$ is open because $M$ is open and $f$ is continuous, and $D$ is also convex (see Problem A.6). Fix $x \in M$. Then we observe that $(x, f(x)) \notin D$. The separation theorem for convex sets (see Theorem A.69) yields the existence of $(\ell,s) \in X^* \times \mathbb{R}$ with $(\ell,s) \ne (0,0)$ and $\gamma \in \mathbb{R}$ such that

\[ \langle \ell, z \rangle_{X^*,X} + s r \le \gamma \le \langle \ell, x \rangle_{X^*,X} + s f(x) \quad \text{for all } r > f(z),\ z \in M. \]

Letting $r$ tend to $+\infty$ implies that $s \le 0$. Also $s \ne 0$ because otherwise $\langle \ell, z \rangle_{X^*,X} \le \langle \ell, x \rangle_{X^*,X}$ for all $z \in M$ which would imply that also $\ell$ vanishes², a contradiction to $(\ell,s) \ne (0,0)$. Therefore, $s < 0$ and, without loss of generality (division by $|s|$), we can assume that $s = -1$. Letting $r$ tend to $f(z)$ yields $\langle \ell, z \rangle_{X^*,X} - f(z) \le \langle \ell, x \rangle_{X^*,X} - f(x)$ which shows that $\ell \in \partial f(x)$.

(b) $f'(x) \in \partial f(x)$ follows from Lemma A.70. Let $\ell \in \partial f(x)$ and $y \in X$ arbitrary with $y \ne 0$. For sufficiently small $t > 0$ we have that $x + ty \in M$ and thus

\[ f(x + ty) - f(x) - t f'(x)y \ge t \langle \ell, y \rangle - t f'(x)y = t\bigl(\ell - f'(x)\bigr)y. \]

Division by $t > 0$ and letting $t$ tend to zero implies that the left-hand side tends to zero by the definition of the derivative. Therefore, $(\ell - f'(x))y \le 0$. Since this holds for all $y \in X$ we conclude that $\ell = f'(x)$.


Lemma A.73 Let $\psi : [0,a] \to \mathbb{R}$ be a continuous, concave, and monotonically increasing function with $\psi(0) = 0$.

(a) Then $\psi(st) \le \max\{1,s\}\, \psi(t)$ for all $s, t \ge 0$ with $st, t \in [0,a]$.

(b) The function $t \mapsto [\psi(\sqrt{t})]^2$ is concave on $[0,a^2]$.

(c) Let $K : X \to Y$ be a linear compact operator between Hilbert spaces and $a \ge \|K\|_{\mathcal{L}(X,Y)}$. Then

\[ \bigl\|\psi\bigl((K^*K)^{1/2}\bigr) z\bigr\|_X \le \psi\bigl(\|Kz\|_Y\bigr) \quad \text{for all } z \in X \text{ with } \|z\|_X \le 1. \]


² The reader should verify this himself by using that M is an open set.
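Proof: (a) If $s \le 1$ the assertion follows from the monotonicity of $\psi$. If $s \ge 1$ then

\[ \psi(t) = \psi\Bigl(\frac{1}{s}\, st + \Bigl(1 - \frac{1}{s}\Bigr) 0\Bigr) \ge \frac{1}{s}\, \psi(st) + \Bigl(1 - \frac{1}{s}\Bigr) \psi(0) = \frac{1}{s}\, \psi(st). \]

(b) Set $\varphi(t) = [\psi(\sqrt{t})]^2$ for $t \in [0,a^2]$. If $\psi$ was twice differentiable then an elementary calculation shows that

\[ \varphi''(t) = \frac{1}{2t\sqrt{t}}\, \psi'(\sqrt{t})\bigl[\sqrt{t}\, \psi'(\sqrt{t}) - \psi(\sqrt{t})\bigr] + \frac{1}{2t}\, \psi(\sqrt{t})\, \psi''(\sqrt{t}). \]

The first term is non-positive because $\psi' \ge 0$ and $s\psi'(s) - \psi(s) = \psi(0) - \psi(s) - (0 - s)\psi'(s) \le 0$ by Lemma A.70. The second term is also non-positive because $\psi \ge 0$ and $\psi'' \le 0$. Therefore, $\varphi''(t) \le 0$ for all $t$ which proves that $\varphi$ is concave.

If $\psi$ is merely continuous then we approximate $\psi$ by a sequence $(\psi_\ell)$ of smooth concave and monotonically increasing functions with $\psi_\ell(0) = 0$ and $\|\psi_\ell - \psi\|_\infty \to 0$ as $\ell$ tends to infinity. We sketch the proof but leave the details to the reader.

In the first step we approximate $\psi$ by the interpolating polygonal function $p_m$ with respect to $t_j = j\,\frac{a}{m}$, $j = 0,\ldots,m$, with values $\psi_j := \psi(t_j)$ at $t_j$. Then, for any $\ell \in \mathbb{N}$ there exists $m = m(\ell) \in \mathbb{N}$ with $\|p_m - \psi\|_\infty \le 1/\ell$. Next, we extend $p_m$ onto $\mathbb{R}$ by extending the first and the last segment linearly; that is, by setting $p_m(t) = \psi_1\, \frac{t}{t_1}$ for $t \le 0$ and $p_m(t) = \psi(a) + \frac{\psi(a) - \psi_{m-1}}{a - t_{m-1}}\,(t - a)$ for $t \ge a$.

In the third step we smooth $p_m$ by using a mollifier; that is, a non-negative function $\phi \in C^\infty(\mathbb{R})$ with $\phi(t) = 0$ for $|t| \ge 1$ and $\int_{-1}^1 \phi(t)\,dt = 1$. We set $\phi_\rho(t) = \frac{1}{\rho}\, \phi\bigl(\frac{t}{\rho}\bigr)$ and

\[ \tilde{\psi}_\rho(t) = \int_{-\infty}^{\infty} \phi_\rho(t - s)\, p_m(s)\,ds = \int_{-\infty}^{\infty} \phi_\rho(s)\, p_m(t - s)\,ds, \quad t \in [0,a]. \]

Then $\tilde{\psi}_\rho$ is in $C^\infty[0,a]$, concave, monotonically increasing and $\|\tilde{\psi}_\rho - p_m\|_{C[0,a]} \to 0$ as $\rho$ tends to zero. Finally we set $\psi_\ell(t) = \tilde{\psi}_\rho(t) - \tilde{\psi}_\rho(0)$, where $\rho = \rho(\ell) > 0$ is such that $\|\tilde{\psi}_\rho - p_m\|_{C[0,a]} \le 1/\ell$. This sketches the construction of the sequence $\psi_\ell$.

From the first part we know that $t \mapsto [\psi_\ell(\sqrt{t})]^2$ is concave for every $\ell$. Letting $\ell$ tend to infinity proves the assertion.

(c) We set again $\varphi(t) = [\psi(\sqrt{t})]^2$ for $t \in [0, \|K\|^2_{\mathcal{L}(X,Y)}]$ and let $z \in X$ with $\|z\|_X \le 1$. We decompose $z$ in the form $z = z_0 + z^\perp$ with $z_0 \in \mathcal{N}(K)$ and $z^\perp \perp \mathcal{N}(K)$. Then $\psi((K^*K)^{1/2})z = \psi((K^*K)^{1/2})z^\perp$ and $Kz = Kz^\perp$. Therefore, it suffices to take $z \in X$ with $z \perp \mathcal{N}(K)$ and $\|z\|_X \le 1$. With a singular system $\{\mu_j, x_j, y_j : j \in J\}$ of $K$ we expand such a $z \in X$ as $z = \sum_{j \in J} z_j x_j$. We set $J_n = J$ if $J$ is finite and $J_n = \{1,\ldots,n\}$ if $J = \mathbb{N}$ and $\hat{z}^{(n)} = \sum_{j \in J_n} \hat{z}_j x_j$ with $\hat{z}_j = z_j \big/ \sqrt{\sum_{\ell \in J_n} |z_\ell|^2}$. Then $\|\hat{z}^{(n)}\|_X^2 = \sum_{j \in J_n} |\hat{z}_j|^2 = 1$ and thus

\[
\begin{aligned}
\bigl\|\psi\bigl((K^*K)^{1/2}\bigr)\hat{z}^{(n)}\bigr\|_X^2
&= \sum_{j \in J_n} [\psi(\mu_j)]^2\, |\hat{z}_j|^2 = \sum_{j \in J_n} \varphi(\mu_j^2)\, |\hat{z}_j|^2 \\
&\le \varphi\Bigl(\sum_{j \in J_n} \mu_j^2\, |\hat{z}_j|^2\Bigr) = \varphi\bigl(\|K\hat{z}^{(n)}\|_Y^2\bigr) = \bigl[\psi\bigl(\|K\hat{z}^{(n)}\|_Y\bigr)\bigr]^2
\end{aligned}
\]

where we used that $\varphi$ is concave. Letting $n$ tend to infinity yields $\hat{z}^{(n)} \to \hat{z} := z/\|z\|_X$ and thus $\|\psi((K^*K)^{1/2})\hat{z}\|_X \le \psi(\|K\hat{z}\|_Y)$. Therefore,

\[
\begin{aligned}
\bigl\|\psi\bigl((K^*K)^{1/2}\bigr) z\bigr\|_X &= \|z\|_X\, \bigl\|\psi\bigl((K^*K)^{1/2}\bigr)\hat{z}\bigr\|_X \\
&\le \|z\|_X\, \psi\bigl(\|K\hat{z}\|_Y\bigr) = \|z\|_X\, \psi\Bigl(\frac{1}{\|z\|_X}\, \|Kz\|_Y\Bigr) \le \psi\bigl(\|Kz\|_Y\bigr)
\end{aligned}
\]

where we used the estimate $\psi(st) \le s\,\psi(t)$ from part (a) for $s = 1/\|z\|_X \ge 1$ and $t = \|Kz\|_Y$.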


Lemma A.74 There exist constants $c_+ > 0$ and $c_p \ge 0$ with $c_p = 0$ for $p \le 2$ and $c_p > 0$ for $p > 2$ such that for all $z \in \mathbb{R}$

\[ c_p |z|^p \le |1+z|^p - 1 - pz \le \begin{cases} c_+ |z|^p & \text{if } p \le 2 \text{ or } |z| \ge \frac{1}{2}, \\ c_+ |z|^2 & \text{if } p > 2 \text{ and } |z| \le \frac{1}{2}. \end{cases} \tag{A.54} \]

Proof: To show the upper estimate, let first $|z| \ge \frac{1}{2}$. Then $1 \le 2|z|$ and thus

\[ |1+z|^p - 1 - pz \le (1+|z|)^p + p \cdot 1 \cdot |z| \le (3|z|)^p + p\,2^{p-1} |z|^{p-1} |z| = (3^p + p\,2^{p-1})\, |z|^p . \]

Let now $|z| \le \frac{1}{2}$; that is, $1 \ge 2|z|$. Then $|1+z|^p - 1 - pz = (1+z)^p - 1 - pz =: f(z)$. We compute $f'(z) = p[(1+z)^{p-1} - 1]$ and $f''(z) = p(p-1)(1+z)^{p-2}$. Taylor's formula yields

\[ (1+z)^p - 1 - pz = f(z) = f(0) + f'(0)z + \frac{1}{2} f''(\xi) z^2 = \frac{p(p-1)}{2}\, (1+\xi)^{p-2} z^2 \]

for some $\xi$ with $|\xi| \le |z|$.

Let first $p \ge 2$. Then $(1+\xi)^{p-2} \le (3/2)^{p-2}$, and the estimate is shown.

Let now $p < 2$. Then $(1+\xi)^{p-2} = \frac{1}{(1+\xi)^{2-p}} \le \frac{1}{|z|^{2-p}} = |z|^{p-2}$ because $1 + \xi \ge 1 - |\xi| \ge 2|z| - |z| = |z|$. Therefore, $f(z) \le \frac{p(p-1)}{2} |z|^{p-2} z^2 = \frac{p(p-1)}{2} |z|^p$, and the upper estimate of (A.54) is shown.

To show the lower estimate, we choose $c_p \in (0,1)$ for $p > 2$ such that

\[ c_p^{(p-1)/(p-2)} - c_p + \bigl(1 - c_p^{1/(p-2)}\bigr)^{p-1} \ge 0 \]

and set $c_p = 0$ for $p \le 2$. We distinguish between several cases.

Case A: $z \ge 0$. Set $f(z) := |1+z|^p - 1 - pz - c_p|z|^p = (1+z)^p - 1 - pz - c_p z^p$ for $z \ge 0$. Then $f(0) = 0$ and $f'(z) = p[(1+z)^{p-1} - 1 - c_p z^{p-1}]$ and $f'(0) = 0$ and $f''(z) = p(p-1)[(1+z)^{p-2} - c_p z^{p-2}] \ge p(p-1)[z^{p-2} - c_p z^{p-2}] \ge 0$ because $c_p < 1$ (for $p \le 2$ the bracket equals $(1+z)^{p-2} > 0$ since $c_p = 0$). Therefore, $f'$ is monotonically increasing, thus non-negative. Therefore, $f$ is monotonically increasing, thus non-negative.

Case B: $z \le 0$. We replace $z$ by $-z \ge 0$ and have to show that $f(z) := |1 - z|^p - 1 + pz - c_p z^p \ge 0$ for all $z \ge 0$.

Case B1: $0 \le z \le 1$. Then $f(z) = (1-z)^p - 1 + pz - c_p z^p$ and $f(0) = 0$ and $f'(z) = p[1 - (1-z)^{p-1} - c_p z^{p-1}]$ and $f'(0) = 0$ and $f'(1) = p(1 - c_p) > 0$.

If $p \le 2$ then $c_p = 0$ and thus $f' \ge 0$, thus $f \ge 0$ on $[0,1]$.

Let $p > 2$. Then $f''(z) = p(p-1)[(1-z)^{p-2} - c_p z^{p-2}]$ and $f''(0) > 0$ and $f''(1) < 0$. Since $f''' < 0$ on $(0,1)$ there exists exactly one zero $\hat{z} \in (0,1)$ of $f''$. Therefore, $f'$ increases on $[0,\hat{z}]$ and decreases on $[\hat{z},1]$. Therefore, the minimal values of $f'$ on $[0,1]$ are obtained for $z = 0$ or $z = 1$ which are both non-negative. Therefore, $f' \ge 0$ on $[0,1]$ which implies that also $f$ is non-negative.

Case B2: $z \ge 1$. Then $f(z) = (z-1)^p - 1 + pz - c_p z^p$ and $f(1) = -1 + p - c_p > 0$ and $f'(z) = p[(z-1)^{p-1} + 1 - c_p z^{p-1}]$ and $f'(1) = p(1 - c_p) > 0$.

If $p \le 2$ we conclude that $f'' > 0$ on $(1,\infty)$, thus $f' > 0$ on $[1,\infty)$ and thus also $f > 0$ on $[1,\infty)$.

Let finally $p > 2$. Then $f''(z) = p(p-1)[(z-1)^{p-2} - c_p z^{p-2}]$ and $f''(1) < 0$ and $f''(z) \to \infty$ as $z \to \infty$. The second derivative $f''$ vanishes at $\hat{z} = \bigl[1 - c_p^{1/(p-2)}\bigr]^{-1} > 1$, and it is negative on $[1,\hat{z})$ and positive for $z > \hat{z}$. Therefore, $f'(z)$ decreases on $(1,\hat{z})$ and increases for $z > \hat{z}$. Its minimal value is attained at $\hat{z}$ which is computed as

\[
\begin{aligned}
f'(\hat{z}) &= p\bigl[(\hat{z} - 1)^{p-1} - c_p \hat{z}^{p-1} + 1\bigr] \\
&= \frac{p}{\bigl(1 - c_p^{1/(p-2)}\bigr)^{p-1}} \Bigl[c_p^{(p-1)/(p-2)} - c_p + \bigl(1 - c_p^{1/(p-2)}\bigr)^{p-1}\Bigr] \ge 0
\end{aligned}
\]

by the choice of $c_p$. Therefore, $f'$ is non-negative for $z \ge 1$ and thus $f$ is positive there.
A.9 Weak Topologies

In this subsection, we recall the basics on weak topologies. We avoid the definition of the topology itself; that is, the definition of the family of open sets, but restrict ourselves to the concept of weak convergence (or weak* convergence).

Definition A.75 Let $X$ be a normed space and $X^*$ its dual space with dual pairing $\langle \ell, x \rangle_{X^*,X}$ for $\ell \in X^*$ and $x \in X$.

(a) A sequence $(x_n)$ in $X$ is said to converge weakly to $x \in X$ if $\lim_{n \to \infty} \langle \ell, x_n \rangle_{X^*,X} = \langle \ell, x \rangle_{X^*,X}$ for all $\ell \in X^*$. We write $x_n \rightharpoonup x$ for the weak convergence.

(b) A sequence $(\ell_n)$ in $X^*$ is said to converge weak* to $\ell \in X^*$ if $\lim_{n \to \infty} \langle \ell_n, x \rangle_{X^*,X} = \langle \ell, x \rangle_{X^*,X}$ for all $x \in X$. We write $\ell_n \overset{*}{\rightharpoonup} \ell$ for the weak* convergence.

First we note that weak convergence is indeed weaker than norm convergence. This follows directly from the continuity of the functional $\ell \in X^*$. Furthermore, we note that weak convergence in $X^*$ means that $\langle T, \ell_n \rangle_{X^{**},X^*} \to \langle T, \ell \rangle_{X^{**},X^*}$ as $n \to \infty$ for all $T \in X^{**}$ where now $\langle \cdot,\cdot \rangle_{X^{**},X^*}$ denotes the dual pairing in $(X^{**},X^*)$. By the canonical imbedding $X \to X^{**}$ (see formula (A.14)) weak convergence in $X^*$ is stronger than weak* convergence. In reflexive spaces (see Definition A.22) the bidual $X^{**}$ can be identified with $X$ and, therefore, weak and weak* convergence coincide. If $X$ is a Hilbert space with inner product $(\cdot,\cdot)_X$ then, by the representation Theorem A.23 of Riesz, the dual space $X^*$ is identified with $X$ itself, and weak convergence $x_n \rightharpoonup x$ is equivalent to $(x_n, z)_X \to (x, z)_X$ for all $z \in X$.
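The difference between weak and norm convergence in a Hilbert space can be illustrated with the unit coordinate vectors $e_n$ of $\ell^2$: the inner products $(e_n, z)$ tend to zero for every fixed $z \in \ell^2$, while $\|e_n\| = 1$ for all $n$. The sketch below works with a finite truncation of $\ell^2$ and the particular element $z_k = 1/k$; both are assumptions of this illustration.

```python
import numpy as np

N = 2000                                   # truncation dimension (assumed)
z = 1.0 / np.arange(1, N + 1)              # z_k = 1/k is square summable

def e(n):
    """n-th unit coordinate vector (1-based) in the truncated space."""
    v = np.zeros(N)
    v[n - 1] = 1.0
    return v

ns = [1, 10, 100, 1000]
pairings = [float(e(n) @ z) for n in ns]            # (e_n, z) = 1/n
norms = [float(np.linalg.norm(e(n))) for n in ns]   # always equal to 1
print(pairings, norms)
```

The pairings decay like $1/n$ although the norms do not, so $(e_n)$ is consistent with weak convergence to zero but cannot converge to zero in norm.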

We will need the following results which we cite without proof. 


Theorem A.76 Let $X$ be a normed space and $X^*$ its dual space and $(x_n)$ a sequence in $X$ which converges weakly to some $x \in X$. Then the following holds:

(a) The weak limit is well-defined; that is, if $x_n \rightharpoonup x$ and $x_n \rightharpoonup y$ then $x = y$.

(b) The sequence $(x_n)$ is bounded in norm; that is, $\|x_n\|_X$ is bounded in $\mathbb{R}$.

(c) If $X$ is finite dimensional then $(x_n)$ converges to $x$ in norm. Therefore, in finite-dimensional spaces there is no difference between weak and norm convergence.

(d) Let $U \subset X$ be a convex and closed set. Then $U$ is also weakly sequentially closed; that is, every weak limit point $x \in X$ of a weakly convergent sequence $(x_n)$ in $U$ also belongs to $U$.

(e) Let $Y$ be another normed space and $K : X \to Y$ be a linear bounded operator. Then $(Kx_n)$ converges weakly to $Kx$ in $Y$. If $K$ is compact then $(Kx_n)$ converges in norm to $Kx$.


The following theorem of Alaoglu-Bourbaki (see, e.g., [139]) is the essential 
ingredient to assure the existence of global minima of the Tikhonov functional 
for nonlinear inverse problems. 


Theorem A.77 Let $X$ be a Banach space with dual space $X^*$. Then every bounded sequence $(\ell_n)$ in $X^*$ has a weak* convergent subsequence.

For Hilbert spaces the theorem takes the following form.

Corollary A.78 Let $X$ be a Hilbert space. Every bounded sequence $(x_n)$ in $X$ contains a weak accumulation point; that is, a weakly convergent subsequence.
A.10 Problems

A.1 (a) Show with Definition A.5 that a set $M \subset X$ is closed if, and only if, the limit of every convergent sequence $(x_k)_k$ in $M$ also belongs to $M$.

(b) Show that a subspace $V \subset X$ is dense if, and only if, the orthogonal complement is trivial; that is, $V^\perp = \{0\}$.

A.2 Try to prove Theorem A.8.

A.3 Let $V \subset X \subset V'$ be a Gelfand triple (see Definition A.26) and $j : V \to X$ and $j' : X \to V'$ the corresponding embedding operators. Show that both of them are one-to-one with dense range.

A.4 Define the operator $A$ from the sequence space $\ell^2$ into itself by

\[ (Ax)_k := \begin{cases} 0, & \text{if } k = 1, \\ x_{k-1}, & \text{if } k \ge 2, \end{cases} \]

for $x = (x_k) \in \ell^2$. Show that $\lambda = 1$ is not an eigenvalue of $A$ but $I - A$ fails to be surjective. Therefore, 1 belongs to the spectrum of $A$.

A.5 Let $K : X \to Y$ be a compact operator between Hilbert spaces $X$ and $Y$, and let $f : [0, \|K\|^2_{\mathcal{L}(X,Y)}] \to \mathbb{R}$ be continuous. Define the operator $f(K^*K)$ from $X$ into itself by (A.47). Show that this operator is well defined (that is, $f(K^*K)x$ defines an element in $X$ for every $x \in X$), linear, and bounded and that it is compact if, and only if, $f(0) = 0$.

A.6 Let $A \subset X$ be an open set of a Hilbert space $X$ and $f : A \to \mathbb{R}$ convex and continuous. Show that the set $D \subset X \times \mathbb{R}$, defined by

\[ D := \{(z,r) \in A \times \mathbb{R} : r > f(z)\}, \]

is convex and open in $X \times \mathbb{R}$.
Proofs of the Results of Section 2.7


In this appendix, we give the complete proofs of the theorems and lemmas of 
Chapter 2, Section 2.7. For the convenience of the reader, we formulate the 
results again. 


Theorem 2.20 (Fletcher-Reeves)

Let $K : X \to Y$ be a bounded, linear, and injective operator between Hilbert spaces $X$ and $Y$. The conjugate gradient method is well-defined and either stops or produces sequences $(x^m)$, $(p^m)$ in $X$ with the properties

\[ \bigl(\nabla f(x^m), \nabla f(x^j)\bigr)_X = 0 \quad \text{for all } j \ne m, \tag{2.41a} \]

and

\[ (Kp^m, Kp^j)_Y = 0 \quad \text{for all } j \ne m; \tag{2.41b} \]

that is, the gradients are orthogonal and the directions $p^m$ are $K$-conjugate. Furthermore,

\[ \bigl(\nabla f(x^j), K^*Kp^m\bigr)_X = 0 \quad \text{for all } j < m. \tag{2.41c} \]
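Proof: First, we note the following identities:

($\alpha$) $\nabla f(x^{m+1}) = 2K^*(Kx^{m+1} - y) = 2K^*(Kx^m - y) - 2t_m K^*Kp^m = \nabla f(x^m) - 2t_m K^*Kp^m$.

($\beta$) $\bigl(p^m, \nabla f(x^{m+1})\bigr)_X = \bigl(p^m, \nabla f(x^m)\bigr)_X - 2t_m (Kp^m, Kp^m)_Y = 0$ by the definition of $t_m$.

($\gamma$) $t_m = \frac{1}{2} \bigl(\nabla f(x^m), p^m\bigr)_X \big/ \|Kp^m\|_Y^2 = \frac{1}{4} \|\nabla f(x^m)\|_X^2 \big/ \|Kp^m\|_Y^2$ since $p^m = \frac{1}{2}\nabla f(x^m) + \gamma_{m-1}p^{m-1}$ and ($\beta$).

Now we prove the following identities by induction with respect to $m$:

(i) $\bigl(\nabla f(x^m), \nabla f(x^j)\bigr)_X = 0$ for $j = 0,\ldots,m-1$,

(ii) $(Kp^m, Kp^j)_Y = 0$ for $j = 0,\ldots,m-1$.

Let $m = 1$. Then, using ($\alpha$),

(i) $\bigl(\nabla f(x^1), \nabla f(x^0)\bigr)_X = \|\nabla f(x^0)\|_X^2 - 2t_0 \bigl(Kp^0, K\nabla f(x^0)\bigr)_Y = 0$,

which vanishes by ($\gamma$) since $p^0 = \frac{1}{2}\nabla f(x^0)$.

(ii) By the definition of $p^1$ and identity ($\alpha$), we conclude that

\[
\begin{aligned}
(Kp^1, Kp^0)_Y = (p^1, K^*Kp^0)_X
&= -\frac{1}{2t_0} \Bigl(\tfrac{1}{2}\nabla f(x^1) + \gamma_0 p^0,\ \nabla f(x^1) - \nabla f(x^0)\Bigr)_X \\
&= -\frac{1}{2t_0} \Bigl[\tfrac{1}{2} \|\nabla f(x^1)\|_X^2 - \gamma_0 \bigl(p^0, \nabla f(x^0)\bigr)_X\Bigr] = 0,
\end{aligned}
\]

where we have used ($\beta$), the definition of $p^0$, and the choice of $\gamma_0$.

Now we assume the validity of (i) and (ii) for $m$ and show it for $m+1$:

(i) For $j = 0,\ldots,m-1$ we conclude that (setting $\gamma_{-1} = 0$ in the case $j = 0$)

\[
\begin{aligned}
\bigl(\nabla f(x^{m+1}), \nabla f(x^j)\bigr)_X
&= \bigl(\nabla f(x^m), \nabla f(x^j)\bigr)_X - 2t_m \bigl(\nabla f(x^j), K^*Kp^m\bigr)_X \\
&= -4t_m \bigl(Kp^j - \gamma_{j-1}Kp^{j-1}, Kp^m\bigr)_Y = 0,
\end{aligned}
\]

where we have used $\frac{1}{2}\nabla f(x^j) + \gamma_{j-1}p^{j-1} = p^j$ and assertion (ii) for $m$.

For $j = m$, we conclude that

\[
\begin{aligned}
\bigl(\nabla f(x^{m+1}), \nabla f(x^m)\bigr)_X
&= \|\nabla f(x^m)\|_X^2 - 2t_m \bigl(\nabla f(x^m), K^*Kp^m\bigr)_X \\
&= \|\nabla f(x^m)\|_X^2 - \frac{1}{2}\, \frac{\|\nabla f(x^m)\|_X^2}{\|Kp^m\|_Y^2}\, \bigl(\nabla f(x^m), K^*Kp^m\bigr)_X
\end{aligned}
\]

by ($\gamma$). Now we write

\[ (Kp^m, Kp^m)_Y = \Bigl(Kp^m, K\bigl(\tfrac{1}{2}\nabla f(x^m) + \gamma_{m-1}p^{m-1}\bigr)\Bigr)_Y = \frac{1}{2} \bigl(Kp^m, K\nabla f(x^m)\bigr)_Y \]

which implies that $\bigl(\nabla f(x^{m+1}), \nabla f(x^m)\bigr)_X$ vanishes.

(ii) For $j = 0,\ldots,m-1$, we conclude that, using ($\alpha$) and assertion (ii) for $m$,

\[
(Kp^{m+1}, Kp^j)_Y = \Bigl(\tfrac{1}{2}\nabla f(x^{m+1}) + \gamma_m p^m,\ K^*Kp^j\Bigr)_X
= -\frac{1}{4t_j} \bigl(\nabla f(x^{m+1}),\ \nabla f(x^{j+1}) - \nabla f(x^j)\bigr)_X
\]

which vanishes by (i).

For $j = m$ by ($\alpha$) and the definition of $p^{m+1}$, we have

\[
\begin{aligned}
(Kp^{m+1}, Kp^m)_Y
&= \frac{1}{2t_m} \Bigl(\tfrac{1}{2}\nabla f(x^{m+1}) + \gamma_m p^m,\ \nabla f(x^m) - \nabla f(x^{m+1})\Bigr)_X \\
&= \frac{1}{2t_m} \Bigl\{ \tfrac{1}{2} \bigl(\nabla f(x^{m+1}), \nabla f(x^m)\bigr)_X - \tfrac{1}{2} \|\nabla f(x^{m+1})\|_X^2 \\
&\qquad\quad + \gamma_m \underbrace{\bigl(p^m, \nabla f(x^m)\bigr)_X}_{= \frac{1}{2}\|\nabla f(x^m)\|_X^2} - \gamma_m \bigl(p^m, \nabla f(x^{m+1})\bigr)_X \Bigr\} \\
&= \frac{1}{4t_m} \bigl\{ \gamma_m \|\nabla f(x^m)\|_X^2 - \|\nabla f(x^{m+1})\|_X^2 \bigr\}
\end{aligned}
\]

by (i) and ($\beta$). This term vanishes by the definition of $\gamma_m$. Thus we have proven (i) and (ii) for $m+1$ and thus for all $m = 1,2,3,\ldots$ To prove (2.41c) we write

\[ \bigl(\nabla f(x^j), K^*Kp^m\bigr)_X = 2 \bigl(Kp^j - \gamma_{j-1}Kp^{j-1}, Kp^m\bigr)_Y = 0 \quad \text{for } j < m, \]

and note that we have already shown this in the proof.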
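The orthogonality relations (2.41a) and (2.41b) can be observed numerically. The following sketch uses only the recursions that appear in the proof above ($p^0 = \frac{1}{2}\nabla f(x^0)$, the step size from ($\gamma$), $x^{m+1} = x^m - t_m p^m$, and $\gamma_m = \|\nabla f(x^{m+1})\|^2 / \|\nabla f(x^m)\|^2$); the random matrix `K`, the data `y`, and the starting point $x^0 = 0$ are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K = rng.normal(size=(8, 5))         # assumed test operator, injective almost surely
y = rng.normal(size=8)

def grad(x):
    return 2.0 * K.T @ (K @ x - y)  # gradient of f(x) = ||Kx - y||^2

x = np.zeros(5)
g = grad(x)
p = 0.5 * g                         # p^0 = (1/2) grad f(x^0)
grads, dirs = [g], [p]
for m in range(4):
    Kp = K @ p
    t = 0.5 * (g @ p) / (Kp @ Kp)   # step size t_m, identity (gamma)
    x = x - t * p
    g_new = grad(x)
    gamma = (g_new @ g_new) / (g @ g)
    p = 0.5 * g_new + gamma * p
    g = g_new
    grads.append(g)
    dirs.append(p)

G = np.array(grads)
P = np.array(dirs)
gram_grad = G @ G.T                 # off-diagonal entries test (2.41a)
gram_conj = (P @ K.T) @ (K @ P.T)   # off-diagonal entries test (2.41b)
off = ~np.eye(len(grads), dtype=bool)
print(np.abs(gram_grad[off]).max(), np.abs(gram_conj[off]).max())
```

In exact arithmetic both printed numbers would be zero; in floating point they are small compared with the diagonal entries as long as the problem is well conditioned.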


Theorem 2.21 Let $(x^m)$ and $(p^m)$ be the sequences of the conjugate gradient method. Define the space $V_m := \operatorname{span}\{p^0,\ldots,p^m\}$. Then we have the following equivalent characterizations of $V_m$:

\[
\begin{aligned}
V_m &= \operatorname{span}\bigl\{\nabla f(x^0),\ldots,\nabla f(x^m)\bigr\} && \text{(2.42a)} \\
&= \operatorname{span}\bigl\{p^0, K^*Kp^0,\ldots,(K^*K)^m p^0\bigr\} && \text{(2.42b)}
\end{aligned}
\]

for $m = 0,1,\ldots$ Furthermore, $x^m$ is the minimum of $f(x) = \|Kx - y\|_Y^2$ on $V_{m-1}$ for every $m \ge 1$.

Proof: Let $\tilde{V}_m = \operatorname{span}\{\nabla f(x^0),\ldots,\nabla f(x^m)\}$. Then $V_0 = \tilde{V}_0$. Assume that we have already shown that $V_m = \tilde{V}_m$. Since $p^{m+1} = \frac{1}{2}\nabla f(x^{m+1}) + \gamma_m p^m$ we also have that $V_{m+1} = \tilde{V}_{m+1}$. Now define the space $\hat{V}_m := \operatorname{span}\{p^0,\ldots,(K^*K)^m p^0\}$. Then $V_0 = \hat{V}_0$ and $V_1 = \hat{V}_1$. Assume that we have already shown that $V_j = \hat{V}_j$ for all $j = 0,\ldots,m$. Then we conclude that

\[
\begin{aligned}
p^{m+1} &= K^*(Kx^{m+1} - y) + \gamma_m p^m \\
&= K^*(Kx^m - y) - t_m K^*Kp^m + \gamma_m p^m \\
&= p^m - \gamma_{m-1}p^{m-1} - t_m K^*Kp^m + \gamma_m p^m \in \hat{V}_{m+1}.
\end{aligned}
\tag{B1}
\]

On the other hand, from

\[ (K^*K)^{m+1} p^0 = (K^*K)\bigl[(K^*K)^m p^0\bigr] \in (K^*K)(V_m) \]

and $K^*Kp^j \in V_{j+1} \subset V_{m+1}$ by (B1) for $j = 0,\ldots,m$, we conclude also that $V_{m+1} = \hat{V}_{m+1}$.

Now every $x^m$ lies in $V_{m-1}$. This is certainly true for $m = 1$, and if it holds for $m$ then it holds also for $m+1$ since $x^{m+1} = x^m - t_m p^m \in V_m$. $x^m$ is the minimum of $f$ on $V_{m-1}$ if and only if $(Kx^m - y, Kz)_Y = 0$ for all $z \in V_{m-1}$. By (2.42a), this is the case if and only if $\bigl(\nabla f(x^m), \nabla f(x^j)\bigr)_X = 0$ for all $j = 0,\ldots,m-1$. This holds by the preceding theorem.


Lemma 2.22 (a) The polynomial $Q_m$, defined by $Q_m(t) = 1 - tP_{m-1}(t)$ with $P_{m-1}$ from (2.43), minimizes the functional

\[ H(Q) := \|Q(KK^*)y\|_Y^2 \quad \text{on} \quad \{Q \in \mathcal{P}_m : Q(0) = 1\} \]

and satisfies

\[ H(Q_m) = \|Kx^m - y\|_Y^2 . \]

(b) For $k \ne \ell$, the following orthogonality relation holds:

\[ \langle Q_k, Q_\ell \rangle := \sum_{j=1}^\infty \mu_j^2\, Q_k(\mu_j^2)\, Q_\ell(\mu_j^2)\, \bigl|(y, y_j)_Y\bigr|^2 = 0. \tag{2.44} \]

If $y \notin \operatorname{span}\{y_1,\ldots,y_N\}$ for any $N \in \mathbb{N}$, then $\langle \cdot,\cdot \rangle$ defines an inner product on the space $\mathcal{P}$ of all polynomials.

Proof: (a) Let $Q \in \mathcal{P}_m$ be an arbitrary polynomial with $Q(0) = 1$. Set $P(t) := (1 - Q(t))/t$ and $x := P(K^*K)K^*y = -P(K^*K)p^0 \in V_{m-1}$. Then

\[ y - Kx = y - KP(K^*K)K^*y = Q(KK^*)y. \]

Thus

\[ H(Q) = \|Kx - y\|_Y^2 \ge \|Kx^m - y\|_Y^2 = H(Q_m). \]

(b) Let $k \ne \ell$. From the identity

\[ \frac{1}{2}\nabla f(x^k) = K^*(Kx^k - y) = -\sum_{j=1}^\infty \mu_j\, Q_k(\mu_j^2)\, (y, y_j)_Y\, x_j , \]

we conclude that

\[ 0 = \frac{1}{4} \bigl(\nabla f(x^k), \nabla f(x^\ell)\bigr)_X = \langle Q_k, Q_\ell \rangle . \]

The properties of the inner product are obvious, except perhaps the definiteness. If $\langle Q_k, Q_k \rangle = 0$, then $Q_k(\mu_j^2)\,(y, y_j)_Y$ vanishes for all $j \in \mathbb{N}$. The assumption on $y$ implies that $(y, y_j)_Y \ne 0$ for infinitely many $j$. But then the polynomial $Q_k$ has infinitely many zeros $\mu_j^2$. This implies $Q_k = 0$, which ends the proof.


The following lemma is needed for the proof of Theorem 2.24. 


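Lemma Let $0 < m \le m(\delta)$ where $m(\delta)$ satisfies (2.46) and define the space $X^\sigma := \mathcal{R}\bigl((K^*K)^{\sigma/2}\bigr)$ for any $\sigma > 0$, equipped with the norm $\|x\|_{X^\sigma} := \|(K^*K)^{-\sigma/2}x\|_X$. Let $y^\delta \notin \operatorname{span}\{y_1,\ldots,y_N\}$ for any $N \in \mathbb{N}$ with $\|y^\delta - Kx^*\|_Y \le \delta$, and let $x^* \in X^\sigma$ for some $\sigma > 0$, and $\|x^*\|_{X^\sigma} \le E$. Then

\[ \|Kx^{m,\delta} - y^\delta\|_Y \le \delta + (1+\sigma)^{(\sigma+1)/2}\, \frac{E}{\bigl|\frac{d}{dt} Q_m^\delta(0)\bigr|^{(\sigma+1)/2}} \]

where $Q_m^\delta$ denotes the polynomial $Q_m$ for $y = y^\delta$.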


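Before we prove this lemma, we recall some properties of orthogonal functions (see [254]).

As we saw in Lemma 2.22, the polynomials $Q_m$ are orthogonal with respect to the inner product $\langle \cdot,\cdot \rangle$. Therefore, the zeros $\lambda_{j,m}$, $j = 1,\ldots,m$, of $Q_m$ are all real and positive and lie in the interval $(0, \|K\|^2_{\mathcal{L}(X,Y)})$. By their normalization, $Q_m$ must have the form

\[ Q_m(t) = \prod_{j=1}^m \Bigl(1 - \frac{t}{\lambda_{j,m}}\Bigr). \]

Furthermore, the zeros of two subsequent polynomials interlace; that is,

\[ 0 < \lambda_{1,m} < \lambda_{1,m-1} < \lambda_{2,m} < \lambda_{2,m-1} < \cdots < \lambda_{m-1,m-1} < \lambda_{m,m} < \|K\|^2_{\mathcal{L}(X,Y)} . \]

Finally, from the factorization of $Q_m$, we see that

\[
\begin{aligned}
\frac{d}{dt} Q_m(t) &= -Q_m(t) \sum_{j=1}^m \frac{1}{\lambda_{j,m} - t} \quad \text{and} \\
\frac{d^2}{dt^2} Q_m(t) &= Q_m(t) \Bigl[ \Bigl(\sum_{j=1}^m \frac{1}{\lambda_{j,m} - t}\Bigr)^2 - \sum_{j=1}^m \frac{1}{(\lambda_{j,m} - t)^2} \Bigr]
= Q_m(t) \sum_{\substack{j,\ell = 1 \\ j \ne \ell}}^m \frac{1}{(\lambda_{j,m} - t)(\lambda_{\ell,m} - t)} .
\end{aligned}
\]

For $0 \le t \le \lambda_{1,m}$, we conclude that $\frac{d}{dt} Q_m(t) \le 0$, $\frac{d^2}{dt^2} Q_m(t) \ge 0$, and $0 \le Q_m(t) \le 1$.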

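For the proof of the lemma and the following theorem, it is convenient to introduce two orthogonal projections. For any $\varepsilon > 0$, we denote by $L_\varepsilon : X \to X$ and $M_\varepsilon : Y \to Y$ the orthogonal projections

\[
L_\varepsilon z := \sum_{\mu_n^2 \le \varepsilon} (z, x_n)_X\, x_n, \quad z \in X, \qquad
M_\varepsilon z := \sum_{\mu_n^2 \le \varepsilon} (z, y_n)_Y\, y_n, \quad z \in Y,
\]

where $\{\mu_n, x_n, y_n : n \in J\}$ denotes a singular system of $K$.

The following estimates are easily checked:

\[ \|M_\varepsilon Kx\|_Y \le \sqrt{\varepsilon}\, \|L_\varepsilon x\|_X \quad \text{and} \quad \|(I - L_\varepsilon)x\|_X \le \frac{1}{\sqrt{\varepsilon}}\, \|(I - M_\varepsilon)Kx\|_Y \]

for all $x \in X$.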


Proof of the lemma: Let j,m be the zeros of Q°,. We suppress the dependence 
on 6. The orthogonality relation (2.44) implies that Q°, is orthogonal to the 
polynomial t +> Q°, (t)/(Aim — t) of degree m — 1; that is, 


yOu 2) Salted |,un)v[> = 0. 
pe 


ym 
n=1 


This implies that 


5 ( ie 5 2 
yo, a [(y°, Yn) ¥| 
8 cel ym — bn 
= > uy ian oe > andy? 
M2 >A1m Bn oer 
> SS O27 any - 
2 >Arm 


From this, we see that 


ye—Ko™ 2 = [| So + SD | 0862)? 16% ay” 


H2<Aim U2 >Aiym 


< oy Q4 ( os I( 5 2 
= Cae sass ip y Yn) | 
2 <A im ‘ 1m Hn , 
=m (U7)? 
= IMs m Pm(KK*)y If, 
where we have set 
t Ay 
&.,(t) = O°, /14+ —— = 0 ,/ —“_. 
() = GO 1+ ~—— = OO 


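Therefore,

\[ \|y^\delta - K x^{m,\delta}\|_Y \le \|M_{\lambda_{1,m}} \Phi_m(KK^*)(y^\delta - y^*)\|_Y + \|M_{\lambda_{1,m}} \Phi_m(KK^*) K x^*\|_Y. \]

We estimate both terms on the right-hand side separately:

\begin{align*}
\|M_{\lambda_{1,m}} \Phi_m(KK^*)(y^\delta - y^*)\|_Y^2
&= \sum_{\mu_n^2 \le \lambda_{1,m}} \Phi_m(\mu_n^2)^2\, \bigl|(y^\delta - y^*, y_n)_Y\bigr|^2 \\
&\le \max_{0 \le t \le \lambda_{1,m}} \Phi_m(t)^2\; \|y^\delta - y^*\|_Y^2, \\
\|M_{\lambda_{1,m}} \Phi_m(KK^*) K x^*\|_Y^2
&= \sum_{\mu_n^2 \le \lambda_{1,m}} \bigl[ \Phi_m(\mu_n^2)^2 \mu_n^{2+2\sigma} \bigr]\, \mu_n^{-2\sigma}\, \bigl|(x^*, x_n)_X\bigr|^2 \\
&\le \max_{0 \le t \le \lambda_{1,m}} \bigl[ t^{1+\sigma} \Phi_m(t)^2 \bigr]\; \|x^*\|_{X^\sigma}^2.
\end{align*}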

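The proof is finished provided we can show that $0 \le \Phi_m(t) \le 1$ and

\[ t^{1+\sigma} \Phi_m(t)^2 \le \Bigl( \frac{1 + \sigma}{\bigl| \frac{d}{dt} Q_m^\delta(0) \bigr|} \Bigr)^{\sigma+1} \quad\text{for all } 0 \le t \le \lambda_{1,m}. \]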

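The first assertion follows from $\Phi_m(0) = 1$, $\Phi_m(\lambda_{1,m}) = 0$, and

\begin{align*}
\frac{d}{dt} \bigl[ \Phi_m(t)^2 \bigr]
&= 2\, Q_m(t)\, \frac{d}{dt} Q_m(t)\, \frac{\lambda_{1,m}}{\lambda_{1,m} - t} + Q_m(t)^2\, \frac{\lambda_{1,m}}{(\lambda_{1,m} - t)^2} \\
&= \Phi_m(t)^2 \Bigl[ \frac{1}{\lambda_{1,m} - t} - 2 \sum_{j=1}^{m} \frac{1}{\lambda_{j,m} - t} \Bigr] \le 0.
\end{align*}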

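Now we set $\psi(t) := t^{1+\sigma} \Phi_m(t)^2$. Then $\psi(0) = \psi(\lambda_{1,m}) = 0$. Let $\hat t \in (0, \lambda_{1,m})$ be the maximum of $\psi$ in this interval. Then $\psi'(\hat t) = 0$, and thus by differentiation

\[ (\sigma + 1)\, \hat t^{\,\sigma} \Phi_m(\hat t)^2 + \hat t^{\,\sigma+1}\, \frac{d}{dt} \bigl[ \Phi_m(\hat t)^2 \bigr] = 0; \]

that is,

\begin{align*}
\sigma + 1 &= \hat t \Bigl[ 2 \sum_{j=1}^{m} \frac{1}{\lambda_{j,m} - \hat t} - \frac{1}{\lambda_{1,m} - \hat t} \Bigr] \ge \hat t \sum_{j=1}^{m} \frac{1}{\lambda_{j,m} - \hat t} \\
&\ge \hat t \sum_{j=1}^{m} \frac{1}{\lambda_{j,m}} = \hat t\, \Bigl| \frac{d}{dt} Q_m^\delta(0) \Bigr|.
\end{align*}

This implies that $\hat t \le (\sigma + 1) / \bigl| \frac{d}{dt} Q_m^\delta(0) \bigr|$. With $\psi(t) \le \hat t^{\,\sigma+1}$ for all $t \in [0, \lambda_{1,m}]$, the assertion follows.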
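The two assertions about $\Phi_m$ can be illustrated numerically. This is a sketch under stated assumptions: the zeros `zeros` and the smoothness index `sigma` are hypothetical values, and the maximum of $\psi$ is only located on a finite grid.

```python
import math

zeros = [0.2, 0.5, 0.9]      # hypothetical zeros lambda_{j,m} of Q_m
sigma = 1.0                  # hypothetical smoothness index
lam1 = min(zeros)
q = sum(1.0 / lam for lam in zeros)   # |Q_m'(0)| = sum_j 1/lambda_j

def Q(t):
    """Q_m(t) = prod_j (1 - t/lambda_j)."""
    return math.prod(1.0 - t / lam for lam in zeros)

def Phi_sq(t):
    """Phi_m(t)^2 = Q_m(t)^2 * lambda_1/(lambda_1 - t), valid for 0 <= t < lambda_1."""
    return Q(t) ** 2 * lam1 / (lam1 - t)

def psi(t):
    """psi(t) = t^(1+sigma) * Phi_m(t)^2."""
    return t ** (1.0 + sigma) * Phi_sq(t)

grid = [lam1 * k / 1000 for k in range(1000)]   # grid on [0, lambda_1), endpoint excluded

phi_values = [Phi_sq(t) for t in grid]
phi_decreasing = all(a >= b for a, b in zip(phi_values, phi_values[1:]))

t_hat = max(grid, key=psi)                       # grid approximation of the maximizer
psi_max = psi(t_hat)
bound = ((1.0 + sigma) / q) ** (sigma + 1.0)     # claimed upper bound for psi
```

For these sample values `psi_max` stays well below `bound`, which reflects that the estimate in the proof is not sharp in general.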


Theorem 2.24 Assume that $y^*$ and $y^\delta$ do not belong to the linear span of finitely many $y_j$. Let the sequence $x^{m(\delta),\delta}$ be constructed by the conjugate gradient method with stopping rule (2.46) for fixed parameter $r > 1$. Let $x^* = (K^*K)^{\sigma/2} z \in X^\sigma$ for some $\sigma > 0$ and $z \in X$. Then there exists $c > 0$ with

\[ \|x^* - x^{m(\delta),\delta}\|_X \le c\, \delta^{\sigma/(\sigma+1)} E^{1/(\sigma+1)}, \tag{2.47} \]

where $E = \|z\|_X$.


Proof: Similar to the analysis of Landweber's method, we estimate the error by the sum of two terms: the first converges to zero as $\delta \to 0$ independently of $m$, and the second term tends to infinity as $m \to \infty$. The role of the norm $\|R_\alpha\|_{\mathcal{L}(Y,X)}$ here is played by $\bigl| \frac{d}{dt} Q_m^\delta(0) \bigr|^{1/2}$.

First, let $\delta$ and $m := m(\delta)$ be fixed. Set for abbreviation

\[ q := \Bigl| \frac{d}{dt} Q_m^\delta(0) \Bigr|. \]
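Choose $0 < \varepsilon \le 1/q \le \lambda_{1,m}$. With

\[ \tilde x := x^* - Q_m^\delta(K^*K)\, x^* = P_{m-1}^\delta(K^*K) K^* y^* \]

we conclude that

\begin{align*}
\|x^* - x^{m,\delta}\|_X
&\le \|L_\varepsilon (x^* - x^{m,\delta})\|_X + \|(I - L_\varepsilon)(x^* - x^{m,\delta})\|_X \\
&\le \|L_\varepsilon (x^* - \tilde x)\|_X + \|L_\varepsilon (\tilde x - x^{m,\delta})\|_X + \frac{1}{\sqrt{\varepsilon}}\, \|(I - M_\varepsilon)(y^* - K x^{m,\delta})\|_Y \\
&\le \|L_\varepsilon Q_m^\delta(K^*K)\, x^*\|_X + \|L_\varepsilon P_{m-1}^\delta(K^*K) K^* (y^* - y^\delta)\|_X \\
&\qquad + \frac{1}{\sqrt{\varepsilon}}\, \|y^* - y^\delta\|_Y + \frac{1}{\sqrt{\varepsilon}}\, \|y^\delta - K x^{m,\delta}\|_Y \\
&\le E \max_{0 \le t \le \varepsilon} \bigl| t^{\sigma/2} Q_m^\delta(t) \bigr| + \delta \max_{0 \le t \le \varepsilon} \bigl| \sqrt{t}\, P_{m-1}^\delta(t) \bigr| + \frac{1 + r}{\sqrt{\varepsilon}}\, \delta.
\end{align*}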


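From $\varepsilon \le \lambda_{1,m}$ and $0 \le Q_m^\delta(t) \le 1$ for $0 \le t \le \lambda_{1,m}$, we conclude that

\[ 0 \le t^{\sigma/2} Q_m^\delta(t) \le \varepsilon^{\sigma/2} \quad\text{for } 0 \le t \le \varepsilon. \]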


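Furthermore,

\[ 0 \le t\, P_{m-1}^\delta(t)^2 = \underbrace{\bigl[ 1 - Q_m^\delta(t) \bigr]}_{\le\, 1}\; \underbrace{\frac{1 - Q_m^\delta(t)}{t}}_{=\, -\frac{d}{dt} Q_m^\delta(s)} \le \Bigl| \frac{d}{dt} Q_m^\delta(0) \Bigr| \]

for some $s \in [0, \varepsilon]$. Thus we have proven the basic estimate

\[ \|x^* - x^{m,\delta}\|_X \le E\, \varepsilon^{\sigma/2} + (1 + r)\, \frac{\delta}{\sqrt{\varepsilon}} + \sqrt{q}\, \delta \quad\text{for } 0 < \varepsilon \le \frac{1}{q}. \tag{B2} \]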


Here $\varepsilon \in (0, 1/q)$ is a free parameter in this expression. We minimize the right-hand side with respect to $\varepsilon$. This gives

\[ \varepsilon_*^{(\sigma+1)/2} = \frac{r + 1}{\sigma}\, \frac{\delta}{E}. \]

Since we do not know if $\varepsilon_*$ lies in the interval $(0, 1/q)$, we have to distinguish between two cases.
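The minimizer $\varepsilon_*$ and the resulting rate can be checked numerically. The sketch below treats only the part of the right-hand side of (B2) that depends on $\varepsilon$ (the term $\sqrt{q}\,\delta$ does not); the values of `E`, `delta`, `r`, and `sigma` are hypothetical.

```python
import math

E, delta, r, sigma = 2.0, 1e-3, 1.5, 1.0    # hypothetical problem data

def g(eps):
    """eps-dependent part of (B2): E*eps^(sigma/2) + (1+r)*delta/sqrt(eps)."""
    return E * eps ** (sigma / 2.0) + (1.0 + r) * delta / math.sqrt(eps)

# Stationary point of g: eps_*^((sigma+1)/2) = (r+1)*delta/(sigma*E).
eps_star = ((r + 1.0) * delta / (sigma * E)) ** (2.0 / (sigma + 1.0))

# Compare with a logarithmic grid spanning two decades on either side of eps_star.
grid = [eps_star * 10.0 ** (k / 50.0) for k in range(-100, 101)]
g_min_on_grid = min(g(e) for e in grid)

# Value at the minimizer, written as a constant times delta^(sigma/(sigma+1)) * E^(1/(sigma+1)).
rate = delta ** (sigma / (sigma + 1.0)) * E ** (1.0 / (sigma + 1.0))
C = ((r + 1.0) / sigma) ** (sigma / (sigma + 1.0)) \
    + (r + 1.0) * (sigma / (r + 1.0)) ** (1.0 / (sigma + 1.0))
```

The identity `g(eps_star) == C * rate` (up to rounding) is where the exponents $\sigma/(\sigma+1)$ and $1/(\sigma+1)$ of estimate (2.47) come from.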
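Case I: $\varepsilon_* \le 1/q$. Then

\[ \sqrt{q} \le \frac{1}{\sqrt{\varepsilon_*}} = \Bigl( \frac{\sigma}{r + 1} \Bigr)^{1/(\sigma+1)} \Bigl( \frac{E}{\delta} \Bigr)^{1/(\sigma+1)}, \]

and thus, by (B2) with $\varepsilon = \varepsilon_*$,

\[ \|x^* - x^{m,\delta}\|_X \le c\, \delta^{\sigma/(\sigma+1)} E^{1/(\sigma+1)} \]

with some constant $c > 0$, which depends only on $\sigma$ and $r$. This case is finished.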


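Case II: $\varepsilon_* > 1/q$. In this case, we substitute $\varepsilon = 1/q$ in (B2) and conclude that

\begin{align*}
\|x^* - x^{m,\delta}\|_X
&\le E\, q^{-\sigma/2} + (r + 2) \sqrt{q}\, \delta \le E\, \varepsilon_*^{\sigma/2} + (r + 2) \sqrt{q}\, \delta \\
&= \Bigl( \frac{r + 1}{\sigma} \Bigr)^{\sigma/(\sigma+1)} \delta^{\sigma/(\sigma+1)} E^{1/(\sigma+1)} + (r + 2) \sqrt{q}\, \delta.
\end{align*}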


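It remains to estimate the quantity $q = q_m = \bigl| \frac{d}{dt} Q_{m(\delta)}^\delta(0) \bigr|$. Until now, we have not used the stopping rule. We will now use this rule to prove the estimate

\[ q_m \le c \Bigl( \frac{E}{\delta} \Bigr)^{2/(\sigma+1)} \tag{B3} \]

for some $c > 0$, which depends only on $\sigma$ and $r$. Analogously to $q_m$, we define $q_{m-1} := \bigl| \frac{d}{dt} Q_{m(\delta)-1}^\delta(0) \bigr|$. By the previous lemma, we already know that

\[ r\delta < \|y^\delta - K x^{m(\delta)-1,\delta}\|_Y \le \delta + \frac{(1 + \sigma)^{(\sigma+1)/2}\, E}{q_{m-1}^{(\sigma+1)/2}}; \]

that is,

\[ q_{m-1}^{(\sigma+1)/2} \le \frac{(1 + \sigma)^{(1+\sigma)/2}}{r - 1}\, \frac{E}{\delta}. \tag{B4} \]

We have to prove such an estimate for $m$ instead of $m - 1$.

Choose $T > 1$ and $\rho^* \in (0, 1)$ with

\[ \frac{T}{T - 1} < r \quad\text{and}\quad \frac{\rho^*}{1 - \rho^*}\, T \le 2. \]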


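If $q_m \le q_{m-1}/\rho^*$, then we are finished by (B4). Therefore, we assume that $q_m > q_{m-1}/\rho^*$. From $\lambda_{j,m} \ge \lambda_{j-1,m-1}$ for all $j = 2, \ldots, m$, we conclude that

\[ q_m = \Bigl| \frac{d}{dt} Q_m^\delta(0) \Bigr| = \sum_{j=1}^{m} \frac{1}{\lambda_{j,m}} \le \frac{1}{\lambda_{1,m}} + \sum_{j=1}^{m-1} \frac{1}{\lambda_{j,m-1}} = \frac{1}{\lambda_{1,m}} + q_{m-1}. \]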
This implies that

\[ q_{m-1} \le \rho^* q_m \le \frac{\rho^*}{\lambda_{1,m}} + \rho^* q_{m-1}; \]

that is,

\[ q_{m-1} \le \frac{\rho^*}{(1 - \rho^*)\, \lambda_{1,m}}. \]
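Finally, we need

\[ \frac{1}{\lambda_{2,m}} \le \frac{1}{\lambda_{1,m-1}} \le \sum_{j=1}^{m-1} \frac{1}{\lambda_{j,m-1}} = q_{m-1} \le \frac{\rho^*}{1 - \rho^*}\, \frac{1}{\lambda_{1,m}}. \]

Now we set $\varepsilon := T \lambda_{1,m}$. Then

\[ \frac{\varepsilon}{\lambda_{2,m}} \le \frac{\rho^*}{1 - \rho^*}\, T \le 2. \]

Define the polynomial $\varphi \in \mathcal{P}_{m-1}$ by

\[ \varphi(t) := Q_m^\delta(t) \Bigl( 1 - \frac{t}{\lambda_{1,m}} \Bigr)^{-1} = \prod_{j=2}^{m} \Bigl( 1 - \frac{t}{\lambda_{j,m}} \Bigr). \]

For $t \le \varepsilon$ and $j \ge 2$, we note that

\[ 1 \ge 1 - \frac{t}{\lambda_{j,m}} \ge 1 - \frac{\varepsilon}{\lambda_{2,m}} \ge -1; \]

that is,

\[ |\varphi(t)| \le 1 \quad\text{for all } 0 \le t \le \varepsilon. \]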


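For $t \ge \varepsilon$, we conclude that

\[ \Bigl| 1 - \frac{t}{\lambda_{1,m}} \Bigr| = \frac{t - \lambda_{1,m}}{\lambda_{1,m}} \ge \frac{\varepsilon}{\lambda_{1,m}} - 1 = T - 1; \]

that is,

\[ |\varphi(t)| \le \frac{1}{T - 1}\, \bigl| Q_m^\delta(t) \bigr| \quad\text{for all } t \ge \varepsilon. \]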

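Since $\varphi(0) = 1$, we can apply Lemma 2.22. Using the projector $M_\varepsilon$, we conclude that

\begin{align*}
r\delta < \|y^\delta - K x^{m-1,\delta}\|_Y
&\le \|\varphi(KK^*)\, y^\delta\|_Y \\
&\le \|M_\varepsilon \varphi(KK^*)\, y^\delta\|_Y + \|(I - M_\varepsilon) \varphi(KK^*)\, y^\delta\|_Y \\
&\le \|M_\varepsilon (y^\delta - y^*)\|_Y + \|M_\varepsilon y^*\|_Y + \frac{1}{T - 1}\, \underbrace{\|Q_m^\delta(KK^*)\, y^\delta\|_Y}_{=\, \|y^\delta - K x^{m,\delta}\|_Y} \\
&\le \frac{T}{T - 1}\, \delta + (T \lambda_{1,m})^{(\sigma+1)/2} E,
\end{align*}

since $\|M_\varepsilon y^*\|_Y = \|M_\varepsilon K x^*\|_Y \le \varepsilon^{(\sigma+1)/2} \|x^*\|_{X^\sigma}$. Defining $c := r - \frac{T}{T-1}$, we conclude that $c\,\delta \le (T \lambda_{1,m})^{(\sigma+1)/2} E$ and thus finally

\[ q_m \le \frac{1}{\lambda_{1,m}} + q_{m-1} \le T \Bigl( \frac{E}{c\,\delta} \Bigr)^{2/(\sigma+1)} + q_{m-1}. \]

Combining this with (B4) proves (B3) and ends the proof.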